Zero-Shot Activity Recognition with Videos

2020 
In this paper, we examine the zero-shot activity recognition task using videos. We introduce an autoencoder-based model that constructs a multimodal joint embedding space between the visual and textual manifolds. On the visual side, we use activity videos and a state-of-the-art 3D convolutional action recognition network to extract features. On the textual side, we work with GloVe word embeddings. Zero-shot recognition results are evaluated by top-n accuracy, and the manifold learning ability is measured by mean Nearest Neighbor Overlap. Finally, we provide an extensive discussion of the results and future directions.
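The abstract does not specify the architecture or metrics in detail, so the following is only a minimal sketch of how such an autoencoder-based joint embedding and its evaluation could look, assuming PyTorch, an MSE reconstruction loss plus a latent alignment term, and random tensors standing in for the extracted 3D-CNN and GloVe features. The `JointEmbeddingAutoencoder` class, the layer sizes, and the helper functions `topn_accuracy` and `mean_nn_overlap` (including the overlap normalization by k) are all hypothetical, not the authors' implementation.

```python
import torch
import torch.nn as nn

class JointEmbeddingAutoencoder(nn.Module):
    """Sketch of a cross-modal autoencoder: encodes 3D-CNN video features
    and GloVe label embeddings into a shared latent space, then
    reconstructs each modality from its own latent code."""
    def __init__(self, visual_dim=2048, text_dim=300, latent_dim=128):
        super().__init__()
        self.visual_enc = nn.Sequential(nn.Linear(visual_dim, 512), nn.ReLU(),
                                        nn.Linear(512, latent_dim))
        self.text_enc = nn.Sequential(nn.Linear(text_dim, 256), nn.ReLU(),
                                      nn.Linear(256, latent_dim))
        self.visual_dec = nn.Sequential(nn.Linear(latent_dim, 512), nn.ReLU(),
                                        nn.Linear(512, visual_dim))
        self.text_dec = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                      nn.Linear(256, text_dim))

    def forward(self, v, t):
        zv, zt = self.visual_enc(v), self.text_enc(t)
        return zv, zt, self.visual_dec(zv), self.text_dec(zt)

def topn_accuracy(z_videos, z_classes, true_idx, n=5):
    """Zero-shot prediction: rank class embeddings by Euclidean distance
    to each embedded video; count a hit if the true class is in the top n."""
    d = torch.cdist(z_videos, z_classes)        # (num_videos, num_classes)
    ranked = d.argsort(dim=1)[:, :n]            # indices of the n nearest classes
    return (ranked == true_idx.unsqueeze(1)).any(dim=1).float().mean().item()

def mean_nn_overlap(z_a, z_b, k=5):
    """One plausible reading of mean Nearest Neighbor Overlap: for each item,
    the fraction of its k nearest neighbors shared between the two embedding
    spaces (self excluded), averaged over all items."""
    na = torch.cdist(z_a, z_a).argsort(dim=1)[:, 1:k + 1]  # skip self at rank 0
    nb = torch.cdist(z_b, z_b).argsort(dim=1)[:, 1:k + 1]
    shared = [len(set(a.tolist()) & set(b.tolist())) for a, b in zip(na, nb)]
    return sum(shared) / (k * len(shared))

# Toy usage with random stand-ins for the extracted features.
model = JointEmbeddingAutoencoder()
v = torch.randn(8, 2048)   # placeholder 3D-CNN clip features
t = torch.randn(8, 300)    # placeholder GloVe label vectors
zv, zt, v_rec, t_rec = model(v, t)
loss = (nn.functional.mse_loss(v_rec, v)      # reconstruct visual modality
        + nn.functional.mse_loss(t_rec, t)    # reconstruct textual modality
        + nn.functional.mse_loss(zv, zt))     # align the two manifolds
print(loss.item(),
      topn_accuracy(zv, zt, torch.arange(8), n=5),
      mean_nn_overlap(zv, zt, k=3))
```

The latent alignment term is what makes zero-shot transfer possible in this setup: an unseen class never appears in training videos, but its GloVe vector can still be embedded and matched against embedded test clips by nearest-neighbor search.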