ASPP-DF-PVNet: Atrous Spatial Pyramid Pooling and Distance-Filtered PVNet for occlusion resistant 6D object pose estimation

2021 
Abstract Detecting objects and estimating their 6D poses from a single RGB image is challenging under severe occlusion. Recently, vector-field-based methods have shown some robustness to occlusion and truncation. Building on the vector-field representation, applying a voting strategy to localize 2D keypoints further reduces the influence of outliers. To improve the effectiveness of both the vector-field-based deep network and the voting scheme, we propose Atrous Spatial Pyramid Pooling and Distance-Filtered PVNet (ASPP-DF-PVNet), an occlusion-resistant framework for 6D object pose estimation. ASPP-DF-PVNet uses the Atrous Spatial Pyramid Pooling (ASPP) module of DeepLabv3 to capture multi-scale features and encode global context information, which improves the accuracy of segmentation and vector-field prediction compared with the original PVNet, especially under severe occlusion. Because the distance between a pixel and a keypoint hypothesis affects the voting deviation, we then present a distance-filtered voting scheme that takes voting distances into account to filter out votes with large deviations. Experiments demonstrate that our method outperforms state-of-the-art methods by a considerable margin without using pose refinement, and obtains competitive results against methods with refinement on the LINEMOD and Occlusion LINEMOD datasets.
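To illustrate the idea of distance-filtered voting, the following is a minimal sketch, not the authors' implementation. It assumes that a pixel votes for a keypoint hypothesis when its predicted unit vector is nearly parallel to the direction from the pixel to the hypothesis, and that votes from pixels farther than a fixed cutoff are simply discarded. The function name `distance_filtered_votes`, the parameters `max_dist` and `cos_thresh`, and the hard cutoff itself are all assumptions made for illustration; the abstract does not specify the actual filtering rule.

```python
import numpy as np


def distance_filtered_votes(pixels, directions, hypotheses,
                            max_dist=50.0, cos_thresh=0.99):
    """Count votes per keypoint hypothesis, ignoring far-away pixels.

    pixels:     (N, 2) coordinates of pixels inside the object mask
    directions: (N, 2) predicted unit vectors, one per pixel
    hypotheses: (H, 2) candidate 2D keypoint locations
    max_dist:   hypothetical cutoff; votes from pixels farther than
                this from a hypothesis are dropped (an assumption)
    cos_thresh: minimum cosine between the predicted vector and the
                pixel-to-hypothesis direction for a vote to count
    """
    # Offset from every pixel to every hypothesis: shape (H, N, 2).
    offsets = hypotheses[:, None, :] - pixels[None, :, :]
    dists = np.linalg.norm(offsets, axis=2)              # (H, N)

    # Unit direction from each pixel to each hypothesis.
    unit = offsets / np.maximum(dists, 1e-8)[..., None]

    # Cosine between predicted vectors and those directions.
    cos = np.sum(unit * directions[None, :, :], axis=2)  # (H, N)

    agrees = cos >= cos_thresh   # direction is consistent
    near = dists <= max_dist     # distance filter (assumed form)
    return np.sum(agrees & near, axis=1)
```

The intuition behind the filter is that a fixed angular error in a predicted vector translates into a larger positional error the farther the pixel is from the hypothesis, so distant votes are less reliable. Passing `max_dist=np.inf` disables the filter and reduces the function to plain direction-agreement voting, which makes it easy to compare the two behaviours on the same inputs.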