I-vector based deep neural network acoustic model adaptation using multilingual language resource

2016 
I-vector adaptation of DNN-HMM acoustic models has shown clear performance improvement for speech recognition. In this paper, we study this technique on Babel task. we use Swahili as target language (training data of 50 hours) and another 6 languages as multilingual resources to train i-vector extractors respectively. Our study shows that i-vector extractors trained with more multilingual data only produce slightly improved results. Moreover, we compared two i-vectors adaptation methods, 1) concatenate i-vectors with spectral features; 2) predict a bias term adding it to spectral features from i-vectors using a NN. When DNN is trained from scratch, the two methods perform similarly. However, only the second method is appropriate in a cross-lingual transfer learning scenario. We investigate it as well, and results show further word error rate reduction can be gained.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    29
    References
    0
    Citations
    NaN
    KQI
    []