Two-Stage Cascaded Decoder for Semantic Segmentation of RGB-D Images

2021 
Exploiting RGB and depth information can boost the performance of semantic segmentation. However, owing to the differences between RGB images and the corresponding depth maps, such multimodal information should be effectively used and combined. Most existing methods use the same fusion strategy to explore multilevel complementary information at various levels, likely ignoring different feature contributions at various levels for segmentation. To address this problem, we propose a network using a two-stage cascaded decoder (TCD), embedding a detail polishing module, to effectively integrate high- and low-level features and suppress noise from low-level details. Additionally, we introduce a depth filter and fusion module to extract informative regions from depth cues with the guidance of RGB images. The proposed TCD network achieves comparable performance to state-of-the-art RGB-D semantic segmentation methods on the benchmark NYUDv2 and SUN RGB-D datasets.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    24
    References
    0
    Citations
    NaN
    KQI
    []