Analysis of keyword spotting performance across IARPA Babel languages

William Hartmann,Damianos Karakos,Roger Hsiao,Le Zhang,Tanel Alumäe,Stavros Tsakalidis,Richard M. Schwartz

2017

With the completion of the IARPA Babel program, it is possible to systematically analyze the performance of speech recognition systems across a wide variety of languages. We select 16 languages from the dataset and compare performance using a deep neural network-based acoustic model. The focus is on keyword spotting using the actual term-weighted value (ATWV) metric. We demonstrate that ATWV is keyword dependent, and that this must be accounted for in any cross-language analysis. Further, we show that while performance across languages does not track with any particular feature of the language, it is correlated with inter-annotator agreement.
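
As a rough illustration of why ATWV depends on the keyword list, the sketch below computes the metric under the commonly used NIST-style definition: the term-weighted value of a keyword is 1 - (P_miss + beta * P_FA), and ATWV is its average over keywords that occur in the reference. The abstract does not give the formula, so the constant beta = 999.9, the false-alarm normalization by seconds of speech, and all names here (`KeywordCounts`, `atwv`) are assumptions for illustration, not the authors' implementation.

```python
from dataclasses import dataclass

BETA = 999.9  # assumed weight on false alarms, as in NIST-style TWV


@dataclass
class KeywordCounts:
    """Per-keyword detection counts (hypothetical container)."""
    n_true: int         # occurrences of the keyword in the reference
    n_correct: int      # correctly detected occurrences
    n_false_alarm: int  # detections with no matching reference occurrence


def atwv(counts, speech_seconds, beta=BETA):
    """Average term-weighted value over keywords with reference occurrences."""
    values = []
    for c in counts:
        if c.n_true == 0:
            continue  # keywords absent from the reference are skipped
        p_miss = 1.0 - c.n_correct / c.n_true
        # Non-target trials are approximated by one trial per second of speech.
        p_fa = c.n_false_alarm / (speech_seconds - c.n_true)
        values.append(1.0 - (p_miss + beta * p_fa))
    if not values:
        raise ValueError("no keyword has reference occurrences")
    return sum(values) / len(values)


# Every keyword carries equal weight, so one miss on a keyword that occurs
# once costs as much as missing all ten occurrences of a frequent keyword.
frequent = KeywordCounts(n_true=10, n_correct=10, n_false_alarm=0)
rare_missed = KeywordCounts(n_true=1, n_correct=0, n_false_alarm=0)
score = atwv([frequent, rare_missed], speech_seconds=36000.0)
```

Because each keyword contributes equally regardless of how often it occurs, two keyword lists with different frequency profiles can yield different ATWV on the same system output, which is consistent with the abstract's point that cross-language comparisons need to account for the keywords.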

Keywords:

Training set
Keyword spotting
Decoding methods
Speech recognition
Artificial neural network
Acoustic model
Computer science
Natural language processing
Artificial intelligence
