Dewarping of document images: A semi-CNN based approach

2021 
The camera-captured digital documents may be often distorted and warped due to various document surfaces or camera angles. Also, the OCR systems find difficulty in reading such distorted images. In this paper, a framework for dewarping the images based on estimating the change of pixel-positions due to the unevenness of the surface is proposed. Here, at first, the changes of pixel-positions are measured using the warping factors, which depend on warping position and control parameters. The warping control parameters are calculated from the top and bottom text lines of the document. The warping positional parameters are estimated using the convolution neural network (CNN) that needs many images for training. Capturing such a large number of images is very difficult. For this purpose, we synthetically generated a warped document image dataset. The proposed dewarping technique works for both alphabetic and alpha-syllabary scripts. The results on Bangla (alphasyllabary) and English (alphabetic) are encouraging.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    56
    References
    0
    Citations
    NaN
    KQI
    []