Sentence-State LSTMs For Sequence-to-Sequence Learning.

2021 
Transformer is currently the dominant method for sequence-to-sequence problems, while RNNs have become less popular due to their lack of parallelization and relatively lower performance. In this paper, we propose to use a parallelizable variant of bi-directional LSTMs (BiLSTMs), namely sentence-state LSTMs (S-LSTM), as an encoder for sequence-to-sequence tasks. The complexity of S-LSTM is only \(\mathcal{O}(n)\), compared to \(\mathcal{O}(n^2)\) for Transformer. On four neural machine translation benchmarks, we empirically find that S-LSTM achieves significantly better performance than BiLSTM and convolutional neural networks (CNNs). Compared to Transformer, our model gives competitive performance while being 1.6 times faster during inference.
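To make the parallelization and \(\mathcal{O}(n)\) claim concrete, the following is a minimal sketch of one recurrent step of an S-LSTM-style layer: every word state is updated in parallel from its local neighbours plus a shared sentence-level state, and the sentence state is then refreshed by pooling over all word states. The paper's actual formulation uses additional dedicated gates for the left neighbour, right neighbour, and sentence state; the function names, parameter shapes, and mean-pooling aggregation here are simplifying assumptions for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(d, seed=0):
    """Random parameters for the simplified S-LSTM step (hypothetical layout)."""
    rng = np.random.default_rng(seed)
    p = {}
    for name in ("i", "f", "o", "u"):
        p[f"W_{name}"] = rng.normal(scale=0.1, size=(4 * d, d))
        p[f"b_{name}"] = np.zeros(d)
    p["W_g"] = rng.normal(scale=0.1, size=(d, d))
    p["b_g"] = np.zeros(d)
    return p

def s_lstm_step(h, c, g, params):
    """One simplified S-LSTM recurrent step.

    h: (n, d) word hidden states; c: (n, d) word cell states;
    g: (d,) sentence-level state. All n word positions are updated in
    parallel from a fixed-size local context, so each step costs O(n).
    """
    n, d = h.shape
    # Local context: previous / next word states (zero-padded at the ends)
    # plus the shared sentence state, concatenated per position.
    h_prev = np.vstack([np.zeros((1, d)), h[:-1]])
    h_next = np.vstack([h[1:], np.zeros((1, d))])
    ctx = np.concatenate([h_prev, h, h_next, np.broadcast_to(g, (n, d))], axis=1)

    # Position-wise LSTM-style gates and candidate cell.
    i = sigmoid(ctx @ params["W_i"] + params["b_i"])
    f = sigmoid(ctx @ params["W_f"] + params["b_f"])
    o = sigmoid(ctx @ params["W_o"] + params["b_o"])
    u = np.tanh(ctx @ params["W_u"] + params["b_u"])

    c_new = f * c + i * u
    h_new = o * np.tanh(c_new)

    # Sentence state aggregates all word states (mean pooling as a stand-in
    # for the gated aggregation used in the original S-LSTM).
    g_new = np.tanh(np.mean(h_new, axis=0) @ params["W_g"] + params["b_g"])
    return h_new, c_new, g_new

# Usage: a fixed number of recurrent steps, independent of sentence length n,
# so total encoder cost grows linearly in n.
n, d = 6, 8
params = init_params(d)
h, c, g = np.zeros((n, d)), np.zeros((n, d)), np.zeros(d)
for _ in range(3):
    h, c, g = s_lstm_step(h, c, g, params)
```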