Convolve, Attend and Spell: An Attention-based Sequence-to-Sequence Model for Handwritten Word Recognition

Lei Kang, J. Ignacio Toledo, Pau Riba, Mauricio Villegas, Alicia Fornés, Marçal Rusiñol

2018

This paper proposes Convolve, Attend and Spell, an attention-based sequence-to-sequence model for handwritten word recognition. The proposed architecture has three main parts: an encoder consisting of a CNN and a bi-directional GRU; an attention mechanism that focuses on the pertinent features; and a decoder formed by a one-directional GRU that spells the corresponding word character by character. Compared with the recent state of the art, our model achieves competitive results on the IAM dataset without requiring any pre-processing step, predefined lexicon, or language model. Code and additional results are available at https://github.com/omni-us/research-seq2seq-HTR.
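
To make the role of the attention mechanism concrete, the following is a minimal sketch of a single attention step between encoder outputs and a decoder state. It is not the authors' implementation: the dot-product scoring function, the names `softmax` and `attend`, and the toy feature vectors are all assumptions made for illustration, since the abstract does not specify how the model scores features.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(encoder_states, decoder_state):
    """One attention step (hypothetical helper, not from the paper).

    Scores each encoder state against the current decoder state with a
    dot product (an assumed scoring function), normalises the scores into
    weights, and returns the weights and the weighted-sum context vector.
    """
    scores = [
        sum(h_d * s_d for h_d, s_d in zip(h, decoder_state))
        for h in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, encoder_states))
        for d in range(dim)
    ]
    return weights, context


# Toy input: three encoder time steps, each a 2-dimensional feature vector.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.0, 2.0]

weights, context = attend(encoder_states, decoder_state)
print("attention weights:", weights)
print("context vector:", context)
```

In a full model of this kind, the context vector would be fed to the decoder GRU together with the previously emitted character, and the step would repeat until an end-of-word symbol is produced.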

Keywords:

Speech recognition
Word recognition
Spell
Language model
Convolution
Architecture
Lexicon
Encoder
Computer science
Sequence model
