A Robust Progressive Text Line Segmentation Framework with Markov Line Descriptors

2020 
Text line segmentation (TLS) remains an unsolved problem in document analysis, especially for handwritten text, and it has attracted a surge of interest in recent years. In this paper we introduce a new robust framework that solves the TLS problem via a trial-and-error method within a divide-and-conquer paradigm. Specifically, we propose a set of candidate solutions from constrained optimal Markov line descriptors (COMLD) and then predict each candidate's class using a classifier learned from training data. The framework treats a multiple-line candidate as a new and independent TLS problem and re-estimates the method parameters for it, achieving better adaptation and thus better solutions. Our extensive experimental results indicate that this framework 1) matches or outperforms state-of-the-art TLS solutions on various datasets; 2) is robust against image variations including, but not limited to, rotation, noise, image resolution, and text line spacing; and 3) is trainable for new languages and data and is self-adaptive to new testing data. This evidence suggests that our methodology is a highly promising solution to the TLS problem.
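
The progressive propose, classify, and re-solve loop described above can be illustrated with a minimal sketch. It is not the COMLD method: candidates here come from a simple blank-row split of a horizontal projection profile, the learned classifier is replaced by a caller-supplied function, and parameter re-estimation is reduced to halving a gap threshold. All names (`propose_candidates`, `segment`, `min_gap`, `is_multi_line`) are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Tuple

Band = Tuple[int, int]  # half-open row range [start, end)


def propose_candidates(profile: List[int], start: int, end: int,
                       min_gap: int) -> List[Band]:
    """Split rows [start, end) wherever at least `min_gap` empty rows occur.

    Stand-in for the candidate proposal step (NOT the paper's COMLD).
    `profile[r]` is the amount of ink in image row r.
    """
    bands: List[Band] = []
    band_start = None
    last_ink = None
    gap = 0
    for r in range(start, end):
        if profile[r] > 0:
            if band_start is None:
                band_start = r
            elif gap >= min_gap:
                # The gap is wide enough: close the previous band.
                bands.append((band_start, last_ink + 1))
                band_start = r
            last_ink = r
            gap = 0
        else:
            gap += 1
    if band_start is not None:
        bands.append((band_start, last_ink + 1))
    return bands


def segment(profile: List[int], start: int, end: int, min_gap: int,
            is_multi_line: Callable[[Band], bool]) -> List[Band]:
    """Progressively segment rows [start, end) into text-line bands.

    A candidate judged to contain several lines is treated as a new,
    independent problem with a re-estimated parameter (here: a halved
    gap threshold, an assumption made purely for this sketch).
    """
    lines: List[Band] = []
    for band in propose_candidates(profile, start, end, min_gap):
        if min_gap > 1 and is_multi_line(band):
            lines.extend(segment(profile, band[0], band[1],
                                 max(1, min_gap // 2), is_multi_line))
        else:
            lines.append(band)
    return lines


# Toy profile: two lines separated by one blank row, then a third line.
profile = [0, 5, 5, 5, 0, 6, 6, 6, 0, 0, 0, 0, 4, 4, 4, 0]
# Hypothetical "classifier": bands taller than 4 rows hold several lines.
result = segment(profile, 0, len(profile), 3, lambda b: b[1] - b[0] > 4)
```

With the initial threshold of 3 rows, the first two lines are merged into one candidate; the stand-in classifier flags it as multiple-line, and the recursive call with a smaller threshold separates them. The recursion stops once the threshold reaches 1, so the sketch always terminates.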