Multiscale Deformable Attention and Multilevel Features Aggregation for Remote Sensing Object Detection

2022 
In this letter, a novel object detection method based on the feature pyramid network (FPN) is proposed to improve detection performance on remote sensing objects. First, since information in background regions may interfere with object detection, a novel multiscale deformable attention module (MSDAM) is designed and added on top of the FPN backbone so that the network suppresses background features while highlighting target features. The proposed MSDAM generates attention maps from feature maps with multiscale deformable receptive fields; it can therefore better fit remote sensing objects of various shapes and sizes and predict more precise attention maps for remote sensing images. Second, in the original FPN, each proposal is predicted from feature grids pooled from only one feature level. This process is suboptimal, because the information discarded at the other feature levels and the global contextual information are also meaningful for object detection. Thus, a multilevel features aggregation module (MLFAM) is proposed to aggregate the multilevel outputs of the FPN with the global context of the whole image, generating more powerful pyramidal representations for the subsequent object detection. Experiments conducted on the object detection in optical remote sensing images (DIOR) and remote sensing object detection (RSOD) datasets demonstrate the superiority of the proposed method over the considered state-of-the-art baseline methods in terms of detection accuracy.
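The aggregation idea behind MLFAM can be illustrated with a minimal, dependency-free sketch. The abstract does not specify the module's actual layers, so everything here is an assumption made for illustration: the function names `resize_nearest` and `aggregate_levels`, the choice of nearest-neighbour resizing, plain averaging across levels, global average pooling as the "global context", and the residual add back to each level. Feature maps are single-channel 2D lists rather than real tensors.

```python
# Hypothetical sketch only: the abstract does not describe MLFAM's real layers.
# Feature maps are single-channel 2D lists to keep the example dependency-free.

def resize_nearest(fmap, out_h, out_w):
    """Nearest-neighbour resize of a 2D list to (out_h, out_w)."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [
        [fmap[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def aggregate_levels(levels, ref_index=1):
    """Fuse all pyramid levels plus a global-context term, then redistribute."""
    ref_h, ref_w = len(levels[ref_index]), len(levels[ref_index][0])

    # 1. Bring every level to the reference resolution.
    resized = [resize_nearest(f, ref_h, ref_w) for f in levels]

    # 2. Average across levels so each location sees every scale.
    fused = [
        [sum(f[r][c] for f in resized) / len(resized) for c in range(ref_w)]
        for r in range(ref_h)
    ]

    # 3. Global context (assumed here to be global average pooling),
    #    broadcast back onto every location of the fused map.
    context = sum(sum(row) for row in fused) / (ref_h * ref_w)
    fused = [[v + context for v in row] for row in fused]

    # 4. Resize the fused map back to each level and add it as a residual.
    enhanced = []
    for f in levels:
        h, w = len(f), len(f[0])
        back = resize_nearest(fused, h, w)
        enhanced.append(
            [[f[r][c] + back[r][c] for c in range(w)] for r in range(h)]
        )
    return enhanced


# Toy three-level pyramid with constant values per level.
pyramid = [
    [[1.0] * 4 for _ in range(4)],  # fine level, 4x4
    [[2.0] * 2 for _ in range(2)],  # middle level, 2x2
    [[3.0]],                        # coarse level, 1x1
]
enhanced = aggregate_levels(pyramid)
```

In this toy input every level receives the same fused-plus-context value, so each output level keeps its own resolution while carrying information from all scales. A real implementation would operate on multi-channel tensors and would likely use learned layers in place of the plain averages assumed here.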