Arabic Optical Character Recognition Using Attention Based Encoder-Decoder Architecture

2020 
Optical character recognition (OCR) systems convert scanned documents into text. Arabic OCR is an active area of research where achieving high accuracy remains demanding. This paper presents a model that converts images containing Arabic text into the corresponding text using a deep learning approach. The model requires no knowledge of the underlying language and is trained end-to-end on the KAFD dataset. It combines several standard neural components from vision and natural language processing. A Convolutional Neural Network (CNN) extracts features from the image and arranges them in a grid. Each row of the grid is then encoded using a Recurrent Neural Network (RNN). An RNN decoder with a visual attention mechanism generates the output text. Our preliminary experiments show that the presented approach is effective. The overall accuracy obtained is 89.82%; however, the results for some individual fonts are higher than this score.
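To illustrate the visual attention step mentioned in the abstract, here is a minimal sketch of a single decoding step in plain Python. It assumes dot-product scoring between the decoder state and each cell of the encoded feature grid; the abstract does not state which scoring function the authors use, and the names `attend`, `softmax`, `grid`, and `state` are hypothetical, chosen only for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, feature_grid):
    """One attention step over a grid of encoded feature vectors.

    decoder_state: list of floats (current decoder hidden state).
    feature_grid:  list of rows, each row a list of feature vectors
                   with the same length as decoder_state.
    Returns (weights, context): one weight per grid cell in row-major
    order, and the weighted sum of the feature vectors.
    """
    # Flatten the grid so every cell competes in a single softmax.
    cells = [vec for row in feature_grid for vec in row]
    # Assumption: dot-product scoring (not confirmed by the source).
    scores = [sum(h * v for h, v in zip(decoder_state, vec)) for vec in cells]
    weights = softmax(scores)
    dim = len(decoder_state)
    context = [
        sum(w * vec[i] for w, vec in zip(weights, cells)) for i in range(dim)
    ]
    return weights, context


# Toy 2x3 grid of 2-dimensional features standing in for the row-encoder output.
grid = [
    [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]],
]
state = [1.0, 0.0]
weights, context = attend(state, grid)
```

In a full model, the context vector would be fed into the RNN decoder together with the previously emitted character to predict the next one; that part is omitted here.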