Attention Mechanism for Neural Machine Translation: A Survey

2021 
Machine translation, one of the most fundamental challenges in natural language processing, aims to automatically translate between human languages. Neural machine translation has emerged as a very successful paradigm that can learn features directly from data and has led to remarkable breakthroughs in the field. However, the widely adopted encoder-decoder architecture in neural machine translation suffers from effectiveness and training issues. Currently, almost all state-of-the-art neural machine translation models are based on the attention mechanism. Attention models help relate units of the input sequence regardless of their distance in space or time, and at the same time make sequence processing more parallelizable. The introduction of attention has therefore greatly improved the performance of neural machine translation models. Given this period of rapid development, the goal of this paper is to provide a comprehensive summary of state-of-the-art attention models. The most widely adopted attention models and their variants are covered, including but not limited to self-attention, soft attention, hard attention, local attention, global attention, additive attention, multiplicative attention, key-value attention, and their realization in various sequence-to-sequence models. I conclude the survey by identifying promising directions for future research.
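To make the distinction between the additive and multiplicative score functions listed above concrete, the following minimal NumPy sketch contrasts the two; it is an illustration added to this summary rather than code from the survey, and the parameter names (W1, W2, v) and toy dimensions are assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

d = 8                                  # hidden size (arbitrary for the example)
query = np.random.randn(d)             # decoder state s_t
keys = np.random.randn(5, d)           # encoder states h_1 .. h_5

# Multiplicative (dot-product) attention: score(s_t, h_i) = s_t . h_i
mult_scores = keys @ query

# Additive attention: score(s_t, h_i) = v^T tanh(W1 h_i + W2 s_t)
W1 = np.random.randn(d, d)
W2 = np.random.randn(d, d)
v = np.random.randn(d)
add_scores = np.tanh(keys @ W1.T + query @ W2.T) @ v

# Either score vector is normalized into attention weights, which form a
# context vector as a weighted sum of the encoder states.
weights = softmax(mult_scores)
context = weights @ keys
print(context.shape)                   # (8,)
```

Both score functions produce one scalar per encoder state; they differ only in whether the comparison is a learned feed-forward combination (additive) or a direct inner product (multiplicative).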