Recognizing Actions in Videos from Unseen Viewpoints.

Aj Piergiovanni, Michael S. Ryoo

2021

Standard methods for video recognition use large CNNs designed to capture spatio-temporal data. However, training these models requires a large amount of labeled training data covering a wide variety of actions, scenes, settings, and camera viewpoints. In this paper, we show that current convolutional neural network models are unable to recognize actions from camera viewpoints not present in their training data (i.e., unseen view action recognition). To address this, we develop approaches based on 3D representations and introduce a new geometric convolutional layer that can learn viewpoint-invariant representations. Further, we introduce a new, challenging dataset for unseen view recognition and show the approaches' ability to learn viewpoint-invariant representations.

Keywords:

Standard methods
Action recognition
Computer science
Variety (cybernetics)
Artificial intelligence
Viewpoints
Invariant (computer science)
Layer (object-oriented design)
Training set
Convolutional neural network
Machine learning
