Geometry-Aware Recurrent Neural Networks For Active Visual Recognition

Authors:
Ricson Cheng Carnegie Mellon University
Ziyan Wang Carnegie Mellon University
Katerina Fragkiadaki Carnegie Mellon University

Introduction:

The authors present recurrent geometry-aware neural networks that integrate visual information across multiple views of a scene into 3D latent feature tensors, while maintaining a one-to-one mapping between 3D physical locations in the world scene and latent feature locations.

Abstract:

We present recurrent geometry-aware neural networks that integrate visual information across multiple views of a scene into 3D latent feature tensors, while maintaining a one-to-one mapping between 3D physical locations in the world scene and latent feature locations. Object detection, object segmentation, and 3D reconstruction are then carried out directly using the constructed 3D feature memory, as opposed to any of the input 2D images. The proposed models are equipped with differentiable egomotion-aware feature warping and (learned) depth-aware unprojection operations to achieve geometrically consistent mapping between the features in the input frame and the constructed latent model of the scene. We empirically show the proposed model generalizes much better than geometry-unaware LSTM/GRU networks, especially in the presence of multiple objects and cross-object occlusions. Combined with active view selection policies, our model learns to select informative viewpoints to integrate information from by "undoing" cross-object occlusions, seamlessly combining geometry with learning from experience.
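
As a rough illustration of the depth-aware unprojection and egomotion-aware warping described above, the following NumPy sketch lifts a 2D feature map into a 3D voxel grid. It is not the authors' implementation. It assumes a pinhole camera with known intrinsics (fx, fy, cx, cy), a given depth map rather than a learned one, a known rigid transform (R, t) from the camera frame to a fixed reference frame, and hard, non-differentiable binning. All function and parameter names are hypothetical.

```python
import numpy as np


def unproject_features(feat, depth, fx, fy, cx, cy,
                       grid_size, grid_min, grid_max,
                       R=None, t=None):
    """Scatter per-pixel features into a voxel grid (illustrative sketch).

    feat:  (H, W, C) 2D feature map
    depth: (H, W) depth along the camera z-axis
    R, t:  assumed camera-to-reference rotation (3, 3) and translation (3,)
    Returns a (G, G, G, C) grid indexed [x, y, z] holding the mean feature
    of all pixels that land in each voxel (zeros where no pixel lands).
    """
    H, W, C = feat.shape
    R = np.eye(3) if R is None else np.asarray(R, dtype=float)
    t = np.zeros(3) if t is None else np.asarray(t, dtype=float)
    grid_min = np.asarray(grid_min, dtype=float)
    grid_max = np.asarray(grid_max, dtype=float)

    # Pixel coordinates: u runs along columns, v along rows.
    v, u = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")

    # Pinhole back-projection of each pixel to a 3D point in the camera frame.
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    pts = np.stack([x, y, depth], axis=-1).reshape(-1, 3)

    # Egomotion step: express the points in the fixed reference frame so that
    # features from different views land in the same voxels.
    pts = pts @ R.T + t

    # Convert metric coordinates to integer voxel indices; drop points outside.
    idx = np.floor((pts - grid_min) / (grid_max - grid_min) * grid_size).astype(int)
    valid = np.all((idx >= 0) & (idx < grid_size), axis=1)
    idx = idx[valid]
    f = feat.reshape(-1, C)[valid]

    # Accumulate features and counts per voxel, then average.
    grid = np.zeros((grid_size,) * 3 + (C,))
    count = np.zeros((grid_size,) * 3)
    np.add.at(grid, (idx[:, 0], idx[:, 1], idx[:, 2]), f)
    np.add.at(count, (idx[:, 0], idx[:, 1], idx[:, 2]), 1.0)
    filled = count > 0
    grid[filled] /= count[filled][:, None]
    return grid
```

Because every view is written into the same reference-frame grid, each voxel corresponds to one physical location regardless of the camera pose. A recurrent update over such grids would take the place of the simple per-view averaging used here.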

You may want to know: