Random Projected Convolutional Feature for Scene Text Recognition

2016 
Text recognition in natural scene images is an important yet challenging problem because of the irregular nature of scene text. This article proposes a novel method based on random projection (RP) and deep neural networks (DNNs). First, the word image is converted into a sequence of multi-layer convolutional neural network (CNN) features using a sliding window. Then RP is used to embed the original high-dimensional features into a low-dimensional space. Finally, a recurrent neural network (RNN) model is trained to recognize the text in the word image from the RP-CNN features. The benefits of using RP are twofold. First, it preserves geometric relationships during dimension reduction, which effectively reduces the computation and storage burden of the subsequent RNN training without much information loss. Second, the randomness of RP brings information diversity, which can improve the generalization ability of the original features. Experiments show that the recognition performance of RP-CNN features, with 85% dimension reduction, is similar to that of the original high-dimensional features. By ensembling several RNN models based on different RP-CNN features, we obtain higher performance than a single RNN based on the original CNN features. The proposed method shows competitive performance on public datasets such as SVT, ICDAR03, and ICDAR13.
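The random projection step can be illustrated with a minimal sketch. It assumes a dense Gaussian projection matrix, which is one common choice; the abstract does not say which RP variant the authors use. The feature dimension (1000), the number of window positions (6), the seeds, and all function names are hypothetical, and random vectors stand in for real CNN features. The sketch applies an 85% reduction with three different seeds, as one might do to build inputs for an ensemble, and checks how well a pairwise distance survives the projection.

```python
import math
import random


def make_projection(d_in, d_out, seed):
    """Return a d_out x d_in matrix with entries drawn from N(0, 1/d_out).

    Assumption: a dense Gaussian projection; the paper may use another variant.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(d_out)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d_in)]
            for _ in range(d_out)]


def project(matrix, vec):
    """Map one d_in-dimensional feature vector to d_out dimensions."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical sizes: 1000-dim window features reduced by 85% to 150 dims.
D_IN, D_OUT, N_WINDOWS = 1000, 150, 6

# Stand-in for a CNN feature sequence: one vector per sliding-window position.
data_rng = random.Random(0)
sequence = [[data_rng.gauss(0.0, 1.0) for _ in range(D_IN)]
            for _ in range(N_WINDOWS)]

# Different seeds give different projections, one per ensemble member.
projections = [make_projection(D_IN, D_OUT, seed) for seed in (1, 2, 3)]
reduced = [[project(p, v) for v in sequence] for p in projections]

# Ratio of projected to original distance between the first two windows.
# Values near 1.0 mean the geometry is approximately preserved.
original = distance(sequence[0], sequence[1])
ratios = [distance(r[0], r[1]) / original for r in reduced]
print(ratios)
```

Each entry of `reduced` is a lower-dimensional version of the same feature sequence and could be fed to its own RNN; how the ensemble combines the RNN outputs is not specified in the abstract and is left out here.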