Random Projected Convolutional Feature for Scene Text Recognition
2016
Text recognition in natural scene images is an important yet challenging problem because of its irregular nature. This article proposes a novel method based on random projection (RP) and deep neural networks (DNNs). First, the word image is converted into a sequence of multi-layer convolutional neural network (CNN) features using a sliding window. Then RP is used to embed the original high-dimensional features into a low-dimensional space. Finally, a recurrent neural network (RNN) model is trained to recognize the text in the word image from the RP-CNN features. The benefits of using RP are two-fold. First, it preserves geometrical relationships during dimension reduction while effectively reducing the computation and storage burden of the subsequent RNN training without much information loss. Second, RP introduces information diversity through randomness, which can improve the generalization ability of the original features. Experiments show that the recognition performance of RP-CNN features, with 85% dimension reduction, is similar to that of the original high-dimensional features. By ensembling several RNN models trained on different RP-CNN features, we obtain higher performance than a single RNN based on the original CNN features. The proposed method shows competitive performance on public datasets such as SVT, ICDAR03, and ICDAR13.
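The random projection step can be illustrated with a minimal sketch. The abstract does not specify the projection matrix, the feature dimension, or the window count, so the Gaussian matrix, the 4096-dimensional features, the 20 windows, and the function names below are all assumptions chosen for illustration; only the 85% reduction comes from the abstract. Random data stands in for real CNN features.

```python
import numpy as np


def random_projection_matrix(d_in, d_out, seed=0):
    """Gaussian random matrix scaled so that squared distances are
    preserved in expectation (an assumed choice, not the paper's)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((d_in, d_out)) / np.sqrt(d_out)


def project_sequence(features, matrix):
    """Project a (T, d_in) feature sequence to (T, d_out)."""
    return features @ matrix


# Hypothetical sizes: 20 sliding windows, 4096-dim features per window.
T, d_in = 20, 4096
d_out = int(round(d_in * 0.15))  # keep 15% of the dimensions (85% reduction)

# Stand-in for the CNN feature sequence of one word image.
features = np.random.default_rng(42).standard_normal((T, d_in))

R = random_projection_matrix(d_in, d_out, seed=1)
projected = project_sequence(features, R)

# Compare one pairwise distance before and after projection.
orig_dist = np.linalg.norm(features[0] - features[1])
proj_dist = np.linalg.norm(projected[0] - projected[1])
ratio = proj_dist / orig_dist

print(projected.shape)
print(f"distance ratio after projection: {ratio:.3f}")
```

Changing `seed` yields a different projection of the same features, which is one plausible way to obtain the varied RP-CNN features that the ensemble of RNN models is trained on.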
Keywords:
- Artificial intelligence
- Pattern recognition
- Machine learning
- Feature (machine learning)
- Feature (computer vision)
- Recurrent neural network
- Dimensionality reduction
- Artificial neural network
- Convolutional neural network
- Computer science
- Feature extraction
- Random projection
- Speech recognition
- Handwriting recognition