Enabling Zero-shot Multilingual Spoken Language Translation with Language-Specific Encoders and Decoders

2020 
Current end-to-end approaches to Spoken Language Translation (SLT) rely on limited training resources, especially in multilingual settings. In contrast, Multilingual Neural Machine Translation (MultiNMT) approaches can draw on larger, higher-quality data sets. Our proposed method extends a MultiNMT architecture based on language-specific encoders and decoders to the task of Multilingual SLT (MultiSLT). Our experiments on four languages show that coupling the speech encoder to the MultiNMT architecture produces translations of similar quality to a bilingual baseline ($\pm 0.2$ BLEU) while effectively enabling zero-shot MultiSLT. Additionally, we propose using Adapter networks for SLT, which yield consistent improvements of +1 BLEU point across all tested languages.
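The coupling idea in the abstract can be pictured with a short sketch. The following is a minimal, hypothetical PyTorch illustration, not the authors' released code: the class names (Adapter, MultiSLTModel), the 80-dim log-Mel input features, d_model = 512, and the layer counts are all assumptions made for the example. It shows the two ideas described above: a speech encoder producing the shared intermediate representation consumed by language-specific decoders, so that any decoder, including one never jointly trained with the speech encoder, can be selected at inference time (zero-shot MultiSLT), and a bottleneck Adapter bridging the speech and text representation spaces.

```python
import torch
import torch.nn as nn


class Adapter(nn.Module):
    """Bottleneck adapter: layer norm, down-project, ReLU, up-project, residual."""

    def __init__(self, d_model: int, d_bottleneck: int = 64):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.down = nn.Linear(d_model, d_bottleneck)
        self.up = nn.Linear(d_bottleneck, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(torch.relu(self.down(self.norm(x))))


class MultiSLTModel(nn.Module):
    """Speech encoder plus language-specific decoders sharing one interface."""

    def __init__(self, d_model: int = 512, langs=("en", "de", "es", "fr")):
        super().__init__()
        # Stand-in speech encoder: any module mapping acoustic features of
        # shape (batch, frames, feat_dim) to (batch, frames, d_model) would do.
        self.speech_encoder = nn.Sequential(
            nn.Linear(80, d_model),  # 80-dim log-Mel features assumed
            nn.TransformerEncoder(
                nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True),
                num_layers=2,
            ),
        )
        # Adapter mapping speech representations toward the text encoder space.
        self.adapter = Adapter(d_model)
        # One language-specific decoder per target language.
        self.decoders = nn.ModuleDict({
            lang: nn.TransformerDecoder(
                nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True),
                num_layers=2,
            )
            for lang in langs
        })

    def forward(self, speech_feats, tgt_embeds, tgt_lang: str):
        memory = self.adapter(self.speech_encoder(speech_feats))
        # Zero-shot MultiSLT: any decoder can consume the shared
        # representation, even one never trained with the speech encoder.
        return self.decoders[tgt_lang](tgt_embeds, memory)


# Usage: decode 100 frames of speech into German given a 10-token prefix.
model = MultiSLTModel()
speech = torch.randn(1, 100, 80)   # (batch, frames, log-Mel features)
tgt = torch.randn(1, 10, 512)      # (batch, tokens, d_model) embeddings
out = model(speech, tgt, tgt_lang="de")
print(out.shape)                   # torch.Size([1, 10, 512])
```

Because every decoder reads the same d_model-sized memory, switching target language reduces to a dictionary lookup over decoders, which is what makes the zero-shot pairing possible under these assumptions.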