See And Think: Disentangling Semantic Scene Completion

Authors:
Shice Liu Institute of Computing Technology, Chinese Academy of Sciences
Yu Hu Institute of Computing Technology, Chinese Academy of Sciences
Yiming Zeng Institute of Computing Technology, Chinese Academy of Sciences
Qiankun Tang Institute of Computing Technology, Chinese Academy of Sciences
Beibei Jin Institute of Computing Technology, Chinese Academy of Sciences
Yinhe Han Institute of Computing Technology, Chinese Academy of Sciences
Xiaowei Li Institute of Computing Technology, Chinese Academy of Sciences

Abstract:

Semantic scene completion predicts the volumetric occupancy and object categories of a 3D scene, which helps intelligent agents understand and interact with their surroundings. In this work, we propose a disentangled framework that sequentially carries out 2D semantic segmentation, 2D-3D reprojection, and 3D semantic scene completion. This three-stage framework has three advantages: (1) explicit semantic segmentation significantly boosts performance; (2) flexible fusion of sensor data brings good extensibility; (3) progress in any subtask promotes the holistic performance. Experimental results show that whether the input is a single depth map or an RGB-D pair, our framework generates high-quality semantic scene completion and outperforms state-of-the-art approaches on both synthetic and real datasets.
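
For intuition, below is a minimal PyTorch sketch of the three-stage idea: a toy 2D segmentation head, a scatter-based 2D-3D reprojection, and a small 3D completion network. This is an illustrative, assumption-laden sketch, not the paper's architecture: the DisentangledSSC class, the layer widths, the 12-class count, the 60x36x60 voxel grid, and the precomputed pixel-to-voxel index map vox_index (which would come from depth and camera intrinsics) are all hypothetical stand-ins for the real networks.

import torch
import torch.nn as nn

class DisentangledSSC(nn.Module):
    # Sketch of the disentangled pipeline:
    # stage 1 (2D semantic segmentation) -> stage 2 (2D-3D reprojection)
    # -> stage 3 (3D semantic scene completion).
    def __init__(self, num_classes=12, vox_dims=(60, 36, 60)):
        super().__init__()
        self.vox_dims = vox_dims
        # Stage 1: stand-in for a full 2D segmentation network.
        self.seg2d = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, num_classes, 1),
        )
        # Stage 3: stand-in for a full 3D completion network.
        self.ssc3d = nn.Sequential(
            nn.Conv3d(num_classes, 32, 3, padding=1), nn.ReLU(),
            nn.Conv3d(32, num_classes, 3, padding=1),
        )

    def reproject(self, seg_logits, vox_index):
        # Stage 2: scatter per-pixel class scores into the voxel grid via a
        # precomputed pixel->voxel index map; pixels mapping to the same
        # voxel simply overwrite each other (a sketch-level simplification).
        b, c, h, w = seg_logits.shape
        d, hh, ww = self.vox_dims
        flat = seg_logits.reshape(b, c, h * w)                   # (B, C, N)
        vol = seg_logits.new_zeros(b, c, d * hh * ww)
        idx = vox_index.reshape(b, 1, h * w).expand(-1, c, -1)  # (B, C, N)
        vol.scatter_(2, idx, flat)
        return vol.reshape(b, c, d, hh, ww)

    def forward(self, depth, vox_index):
        seg = self.seg2d(depth)               # stage 1: 2D semantics
        vol = self.reproject(seg, vox_index)  # stage 2: lift to 3D
        return self.ssc3d(vol)                # stage 3: complete in 3D

# Toy usage with random inputs (hypothetical shapes):
model = DisentangledSSC()
depth = torch.rand(1, 1, 240, 320)                         # depth image
vox_index = torch.randint(0, 60 * 36 * 60, (1, 240, 320))  # pixel->voxel map
out = model(depth, vox_index)                              # (1, 12, 60, 36, 60)

Because the stages are decoupled, the 2D segmenter can consume a depth map alone or RGB-D, and a stronger segmentation or completion network can be swapped into any stage, which is exactly the extensibility argument the abstract makes.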
