Laplacian Feature Pyramid Network for Object Detection in VHR Optical Remote Sensing Images

Besides multiscale features, high-frequency features are crucial for identifying many objects in very high resolution optical remote sensing (VHR-ORS) images, but they have not yet been considered in object detection for such images. Because the Laplacian pyramid holds high-frequency information at each level, we propose a Laplacian feature pyramid (FP) network (LFPN) that considers both low-frequency and high-frequency features within an FP structure to improve object detection performance on VHR-ORS images. FP-based structures represent multiscale features efficiently. However, general FP-based structures do not specifically consider high-frequency features. Such features are important for distinguishing ground objects that differ mainly in fine detail. For example, texture features are critical for distinguishing basketball_court from tennis_court. LFPN is constructed from a bottom-up pathway, a Laplacian pathway, and a fusion pathway, which generate the low-frequency pyramid, the high-frequency pyramid, and the compound pyramid, respectively. The bottom-up pathway follows the computation flow of the backbone convolutional neural network (CNN), as in general FP-based structures. The Laplacian pathway extracts the high-frequency features of objects through a trainable Laplacian operator. Finally, the low-frequency and high-frequency FPs are fused efficiently to generate the compound pyramid. To evaluate LFPN, we embed it into both two-stage object detection (T-LFPN) systems and single-stage object detection (S-LFPN) systems. Experiments on the public, challenging ten-class NWPU VHR-10 data set demonstrate the superior performance of LFPN in both T-LFPN and S-LFPN systems and the state-of-the-art performance of LFPN-based detectors.
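To make the three pyramids concrete, the following is a minimal sketch of a classical, fixed (non-trainable) Laplacian pyramid on a single 2-D array. It is not the LFPN implementation: the abstract does not specify the trainable Laplacian operator or the fusion rule, so the 2x2 average pooling, the nearest-neighbour upsampling, the element-wise sum used for fusion, and all function names here are assumptions chosen for illustration only.

```python
import numpy as np


def downsample(x):
    """2x2 average pooling: a simple low-pass filter followed by decimation."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(x):
    """Nearest-neighbour 2x upsampling."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)


def laplacian_pyramid(image, levels):
    """Return (low, high) pyramids for a 2-D array.

    low[i]  is the image after i rounds of downsampling (low-frequency pyramid).
    high[i] is low[i] minus the upsampled low[i + 1], i.e. the detail that
            the coarser level cannot represent (high-frequency pyramid).
    Height and width must be divisible by 2 ** levels.
    """
    low = [image]
    for _ in range(levels):
        low.append(downsample(low[-1]))
    high = [low[i] - upsample(low[i + 1]) for i in range(levels)]
    return low, high


def fuse(low, high):
    """Assumed fusion rule: element-wise sum of matching levels."""
    return [l + h for l, h in zip(low, high)]


if __name__ == "__main__":
    img = np.arange(64, dtype=float).reshape(8, 8)
    low, high = laplacian_pyramid(img, levels=2)
    print([a.shape for a in low])   # shapes of the low-frequency levels
    print([a.shape for a in high])  # shapes of the high-frequency levels
```

In this fixed version each finer level can be recovered exactly as `high[i] + upsample(low[i + 1])`, and a constant image yields an all-zero high-frequency pyramid, which illustrates why these residuals carry texture and edge detail rather than overall intensity.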