A Robust and Effective Text Detector Supervised by Contrastive Learning

2021 
Scene text detection is the task of locating text in natural scene images. It is challenging because text varies in size, orientation, and color, and often appears at low contrast or low resolution against complex backgrounds. Detection results for text instances in motion-blurred or low-resolution images remain unsatisfactory. To address these problems, we propose an effective and robust text detection network that incorporates SimCLR, a state-of-the-art contrastive learning method. Before being fed to the feature extractor, each input is augmented in different ways, and we then compute the similarity between the corresponding pairs of extracted features. This significantly improves the detector's performance under difficult conditions. We conduct a series of experiments on the public datasets ICDAR 2013, ICDAR 2015, and MSRA-TD500. On the ICDAR 2015 dataset, our method achieves an F-measure of 0.840 and runs at 9.1 FPS at 720p resolution, demonstrating that the proposed method is effective and efficient.
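The similarity computation over augmented feature pairs can be illustrated with a minimal sketch of the NT-Xent loss used by SimCLR. This is not the paper's implementation: the abstract does not specify the loss, the feature dimensions, or the temperature, so the function names, the plain-list feature vectors, and the default `temperature=0.5` are all assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(view_a, view_b, temperature=0.5):
    """SimCLR-style NT-Xent loss (illustrative sketch, assumed settings).

    view_a[i] and view_b[i] are feature vectors extracted from two
    different augmentations of the same image i.
    """
    n = len(view_a)
    feats = list(view_a) + list(view_b)  # 2n feature vectors in total
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same image
        # Scaled similarities to every other feature, positive pair included.
        logits = [
            cosine_similarity(feats[i], feats[k]) / temperature
            for k in range(2 * n)
            if k != i
        ]
        pos_logit = cosine_similarity(feats[i], feats[pos]) / temperature
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - pos_logit
    return total / (2 * n)


# Toy features: two images, two augmented views each (made-up values).
view_a = [[1.0, 0.0], [0.0, 1.0]]
matched_b = [[0.9, 0.1], [0.1, 0.9]]     # views agree with view_a
mismatched_b = [[0.1, 0.9], [0.9, 0.1]]  # views disagree with view_a

loss_matched = nt_xent_loss(view_a, matched_b)
loss_mismatched = nt_xent_loss(view_a, mismatched_b)
```

In this toy setup the loss is lower when the two views of each image map to similar features, which is the behavior a contrastive objective rewards. How such a term is weighted against the detection loss is not stated in the abstract.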