Single-shot Weakly-supervised Object Detection Guided by Empirical Saliency Model

2021 
Abstract: Although weakly-supervised object detection (WSOD) has become an effective way to reduce the heavy burden of labeling, difficult problems remain. WSOD methods, typified by Multiple Instance Learning (MIL), share some common weaknesses: they run slowly, and they focus on the most discriminative parts of an object rather than the whole object, which leads to false detections. To improve both efficiency and accuracy, we propose a single-shot weakly-supervised object detection model guided by an empirical saliency model (SSWOD). Because human vision tends to focus on the most salient parts of an image, saliency maps can usually guide our model to the most promising object areas. In this way, our model takes the salient areas as pseudo ground-truths and performs the WSOD task with only class labels. Moreover, empirical saliency is designed to refine the pseudo ground-truths and improve detection. Our new framework not only achieves one-step detection without region proposals, but also reduces computational cost. Experiments on the PASCAL VOC 2007 and 2012 benchmarks demonstrate that SSWOD is 8 times faster and 5 times smaller than previous approaches, surpassing state-of-the-art WSOD methods by 6.1% mean average precision (mAP).
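To make the idea of using salient areas as pseudo ground-truths concrete, the following is a minimal sketch, not the procedure used in SSWOD. It assumes a saliency map given as a 2D grid of non-negative scores and derives a single pseudo bounding box by keeping pixels above a fraction of the peak value. The function name `saliency_to_pseudo_box` and the parameter `rel_threshold` are hypothetical, introduced only for illustration; the abstract does not specify how boxes are extracted or refined.

```python
def saliency_to_pseudo_box(saliency, rel_threshold=0.5):
    """Return (x_min, y_min, x_max, y_max) enclosing the salient pixels.

    Assumption: `saliency` is a list of rows of non-negative floats, and a
    pixel counts as salient when its score is at least
    rel_threshold * (maximum score). Returns None if nothing is salient.
    """
    peak = max((v for row in saliency for v in row), default=0.0)
    if peak <= 0:
        return None  # empty or all-zero map: no pseudo ground-truth

    cutoff = rel_threshold * peak
    # Collect (x, y) coordinates of every pixel at or above the cutoff.
    coords = [
        (x, y)
        for y, row in enumerate(saliency)
        for x, v in enumerate(row)
        if v >= cutoff
    ]
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


# Toy 6x8 map with one salient block in rows 2..4 and columns 3..6.
demo_map = [[0.0] * 8 for _ in range(6)]
for y in range(2, 5):
    for x in range(3, 7):
        demo_map[y][x] = 1.0

demo_box = saliency_to_pseudo_box(demo_map)
```

In a pipeline of this kind, a box such as `demo_box` would be paired with the image-level class label and used as a training target for a single-shot detector. Handling several objects per image, and the refinement step the abstract attributes to empirical saliency, are outside the scope of this sketch.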