Scene text segmentation using low variation extremal regions and sorting based character grouping

2017 
Extraction of textual information from natural scene images is a challenging task due to varying imaging conditions and the diversity of text properties. Segmentation of scene text is an important step in the pipeline that significantly affects the final recognition performance. In this paper I propose a new scene text segmentation method. First, a novel approach to character candidate generation based on extremal regions (ERs) is introduced. Subpaths with low area variation are extracted from the ER tree. Instead of using the minimum variation criterion to select character candidates, the position of an ER within its extracted subpath is used as the selection criterion. Each subpath is represented by one ER, which is passed to an SVM-based classification step. Next, a novel character candidate grouping method is used to discard non-character objects that were wrongly classified as characters. The proposed approach estimates the vertical positions of text lines by sorting the y coordinates of region centroids and then checks the spatial relations of adjacent regions within each line. This step significantly improves precision and has lower computational complexity than hierarchical clustering methods. Finally, the last step restores character ERs that were erroneously eliminated by the SVM classifier, exploiting text layout properties to correct false negative classifications. Experimental results on the ICDAR 2013 dataset show that the proposed character candidate generation method efficiently prunes repeating regions and achieves a character recall rate superior to that of a recently published ER-based method. The proposed segmentation algorithm achieves performance competitive with state-of-the-art methods.
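The sorting-based grouping step can be illustrated with a minimal sketch. The abstract does not give the actual rules, so the function name `group_into_lines`, the tolerances `y_tol` and `x_gap`, and the choice to discard isolated regions are all assumptions made for illustration, not the method as published. The sketch sorts region centroids by y to form candidate lines, then sorts each line by x and checks the spacing of adjacent regions.

```python
from typing import List, Tuple

Centroid = Tuple[float, float]  # (x, y) of a region centroid


def group_into_lines(centroids: List[Centroid], heights: List[float],
                     y_tol: float = 0.5, x_gap: float = 3.0) -> List[List[int]]:
    """Group region indices into text lines (illustrative sketch only).

    y_tol and x_gap are hypothetical thresholds, expressed as multiples
    of the larger height of the two regions being compared.
    """
    if not centroids:
        return []

    # Step 1: sort regions by centroid y and start a new band whenever
    # the jump between consecutive y values is too large.
    order = sorted(range(len(centroids)), key=lambda i: centroids[i][1])
    bands = [[order[0]]]
    for prev, cur in zip(order, order[1:]):
        jump = centroids[cur][1] - centroids[prev][1]
        if jump > y_tol * max(heights[prev], heights[cur]):
            bands.append([cur])
        else:
            bands[-1].append(cur)

    # Step 2: within each band, sort by x and keep runs of regions whose
    # neighbours are close enough horizontally.
    lines = []
    for band in bands:
        band.sort(key=lambda i: centroids[i][0])
        run = [band[0]]
        for prev, cur in zip(band, band[1:]):
            gap = centroids[cur][0] - centroids[prev][0]
            if gap > x_gap * max(heights[prev], heights[cur]):
                if len(run) >= 2:
                    lines.append(run)
                run = [cur]
            else:
                run.append(cur)
        # Assumption: an isolated region is treated as a non-character.
        if len(run) >= 2:
            lines.append(run)
    return lines


if __name__ == "__main__":
    pts = [(10, 10), (20, 11), (30, 9), (200, 50), (12, 100), (24, 101)]
    print(group_into_lines(pts, [10.0] * len(pts)))
```

Both passes are dominated by sorting, so the sketch runs in O(n log n) time for n regions, whereas standard agglomerative clustering typically needs at least quadratic time in the number of regions.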