Syntax-Based Attention Masking for Neural Machine Translation

Colin McDonald, David Chiang

2021

We present a simple method for extending transformers to source-side trees. We define a number of masks that limit self-attention based on relationships among tree nodes, and we allow each attention head to learn which mask or masks to use. On translation from English to various low-resource languages, and translation in both directions between English and German, our method always improves over simple linearization of the source-side parse tree and almost always improves over a sequence-to-sequence baseline, by up to +2.1 BLEU.
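
The idea of masking self-attention by tree relationships can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the particular relations (self, parent, child, ancestor, descendant, sibling), the parent-array tree encoding, the function names, and the softmax gate used to mix masks within a head. Each mask is unioned with the self mask so that every row has at least one position to attend to.

```python
import numpy as np

def tree_masks(parent):
    """Build boolean [n, n] masks from a parent array (root has parent -1).

    masks[name][i, j] is True when node i is allowed to attend to node j.
    The set of relations here is hypothetical, chosen only for illustration.
    """
    par = np.asarray(parent)
    n = len(par)
    idx = np.arange(n)
    has_parent = par >= 0

    self_m = np.eye(n, dtype=bool)
    parent_m = np.zeros((n, n), dtype=bool)
    parent_m[idx[has_parent], par[has_parent]] = True
    child_m = parent_m.T.copy()

    # Ancestor mask: transitive closure of the parent relation.
    ancestor_m = np.zeros((n, n), dtype=bool)
    step = parent_m.copy()
    while step.any():
        ancestor_m |= step
        step = (step.astype(int) @ parent_m.astype(int)) > 0
    descendant_m = ancestor_m.T.copy()

    # Siblings share a parent and are distinct nodes.
    sibling_m = (par[:, None] == par[None, :]) & has_parent[:, None] & ~self_m

    return {"self": self_m, "parent": parent_m, "child": child_m,
            "ancestor": ancestor_m, "descendant": descendant_m,
            "sibling": sibling_m}

def masked_attention(q, k, v, mask):
    """Scaled dot-product attention restricted to positions where mask is True."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -1e9)          # block disallowed pairs
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

def mix_over_masks(q, k, v, masks, logits):
    """One head mixing several masks via a softmax gate (an assumed scheme).

    `logits` would be learned per head; masks are taken in sorted-name order.
    """
    names = sorted(masks)
    gate = np.exp(logits - np.max(logits))
    gate = gate / gate.sum()
    out = np.zeros_like(v, dtype=float)
    for g, name in zip(gate, names):
        head_out, _ = masked_attention(q, k, v, masks[name] | masks["self"])
        out += g * head_out
    return out
```

For a five-node tree encoded as `parent = [-1, 0, 0, 1, 1]`, `tree_masks(parent)["sibling"]` links nodes 1 and 2 and nodes 3 and 4, and a head gated entirely toward the parent mask lets node 3 attend only to itself and node 1.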

Keywords:

Parse tree
Syntax (programming languages)
Computer science
Artificial intelligence
Machine translation
Natural language processing
head
Translation (geometry)
Tree (data structure)
Limit (mathematics)
Linearization
