Learning efficient single stage pedestrian detection by squeeze-and-excitation network

2021 
Pedestrian detection plays a pivotal role in computer vision. Recently, deep convolutional neural networks (CNNs) have been shown to outperform hand-crafted methods in object detection, with the Single Shot MultiBox Detector (SSD) being one of the state-of-the-art methods in terms of both speed and accuracy. In this paper, we propose a novel framework that performs pedestrian detection by considering not only local features but also global information, which is incorporated into the features to make them more discriminative for this task. Specifically, we first integrate a feature pyramid network into the SSD detection framework. Next, a Squeeze-and-Excitation network is proposed to encode global information. Hence, the features become more focused on pedestrians, in particular those that are small in scale or occluded. We further introduce a network-in-network fusion module, which enhances the features by incorporating local details. In this way, our framework is able to suppress background information and highlight pedestrian elements. Experimental results show that the proposed framework achieves detection results comparable to state-of-the-art methods and runs at an average of 17 frames per second (fps) on an NVIDIA TITAN X GPU with an image size of $$600\times 600$$.
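To illustrate how a Squeeze-and-Excitation step can encode global information into local features, the following is a minimal NumPy sketch of a generic channel-gating block. It is not the implementation used in this paper: the function name `squeeze_excite`, the reduction ratio `r`, the random weights, and the tensor sizes are all assumptions made for illustration. The block averages each channel over the whole feature map (squeeze), passes the result through a small bottleneck with ReLU and sigmoid activations (excitation), and rescales each channel by the resulting gate.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def squeeze_excite(features, w1, b1, w2, b2):
    """Generic squeeze-and-excitation gating (illustrative sketch).

    features: array of shape (C, H, W)
    w1, b1:   bottleneck weights, shapes (C // r, C) and (C // r,)
    w2, b2:   expansion weights, shapes (C, C // r) and (C,)
    Returns the rescaled features and the per-channel gates.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = features.mean(axis=(1, 2))                 # shape (C,)
    # Excitation: bottleneck with ReLU, then sigmoid gates in (0, 1).
    hidden = np.maximum(w1 @ z + b1, 0.0)          # shape (C // r,)
    gates = sigmoid(w2 @ hidden + b2)              # shape (C,)
    # Rescale: every spatial position of a channel shares the same gate.
    return features * gates[:, None, None], gates


# Hypothetical sizes and random weights, chosen only for the demonstration.
rng = np.random.default_rng(0)
C, H, W, r = 8, 5, 5, 4
features = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
b1 = np.zeros(C // r)
w2 = rng.standard_normal((C, C // r)) * 0.1
b2 = np.zeros(C)

out, gates = squeeze_excite(features, w1, b1, w2, b2)
```

Because the gates are computed from a pooled summary of the entire feature map, each local response is weighted by image-wide context, which is the sense in which such a block can suppress background channels and emphasize pedestrian-related ones.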