Multi-level spatial attention network for image data segmentation

2021 
Deep learning models for semantic image segmentation rely on hierarchical architectures to extract features, which causes contextual and spatial information to be lost. In this paper, a new attention-based network, MSANet, built on an encoder-decoder structure, is proposed for image data segmentation; it aggregates contextual features from different levels and reconstructs spatial characteristics efficiently. To model long-range spatial dependencies among features, a multi-level spatial attention module (MSAM) is introduced that processes multi-level features from the encoder network and captures global contextual information. In this way, the model learns multi-level spatial dependencies between features through the MSAM and hierarchical representations of the input image through the stacked convolutional layers, making it better able to produce accurate segmentation results. The proposed network is evaluated on the PASCAL VOC 2012 and Cityscapes datasets; results show that it achieves strong performance compared with U-Net, FCNs, and DeepLabv3.
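
The abstract does not specify MSAM's internals, so the following is only a minimal sketch of one plausible reading: each encoder level passes through a non-local-style spatial attention block (every position attends to every other position, capturing long-range dependencies), and the attended maps are fused at a common resolution. The class names (SpatialAttention, MultiLevelSpatialAttention), channel counts, reduction ratio, and the upsample-concatenate-fuse scheme are illustrative assumptions, not the paper's specification.

    # Hypothetical sketch of a multi-level spatial attention module.
    # Names, channel counts, and the fusion scheme are assumptions;
    # the paper's exact MSAM design is not given in the abstract.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SpatialAttention(nn.Module):
        """Non-local style spatial attention over one feature map."""

        def __init__(self, channels: int, reduction: int = 8):
            super().__init__()
            self.query = nn.Conv2d(channels, channels // reduction, kernel_size=1)
            self.key = nn.Conv2d(channels, channels // reduction, kernel_size=1)
            self.value = nn.Conv2d(channels, channels, kernel_size=1)
            self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            b, c, h, w = x.shape
            q = self.query(x).flatten(2).transpose(1, 2)  # (B, HW, C/r)
            k = self.key(x).flatten(2)                    # (B, C/r, HW)
            attn = torch.softmax(q @ k, dim=-1)           # (B, HW, HW)
            v = self.value(x).flatten(2)                  # (B, C, HW)
            out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
            return self.gamma * out + x                   # residual connection

    class MultiLevelSpatialAttention(nn.Module):
        """Attends to each encoder level, then fuses the results at the
        highest resolution (assumed fusion: upsample + concat + 1x1 conv)."""

        def __init__(self, level_channels, out_channels: int):
            super().__init__()
            self.attn = nn.ModuleList([SpatialAttention(c) for c in level_channels])
            self.fuse = nn.Conv2d(sum(level_channels), out_channels, kernel_size=1)

        def forward(self, features):
            target = features[0].shape[-2:]  # fuse at the first level's resolution
            attended = [
                F.interpolate(a(f), size=target, mode="bilinear", align_corners=False)
                for a, f in zip(self.attn, features)
            ]
            return self.fuse(torch.cat(attended, dim=1))

    if __name__ == "__main__":
        # Toy multi-level features such as an encoder might produce.
        feats = [torch.randn(1, c, s, s) for c, s in [(64, 64), (128, 32), (256, 16)]]
        msam = MultiLevelSpatialAttention([64, 128, 256], out_channels=64)
        print(msam(feats).shape)  # torch.Size([1, 64, 64, 64])

The quadratic (HW x HW) attention map is what lets every spatial position aggregate context from the whole image, which is the stated goal of the MSAM; in practice such blocks are usually applied to the lower-resolution encoder levels to keep memory manageable.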