Analysis of keyword spotting performance across IARPA Babel languages

2017 
With the completion of the IARPA Babel program, it is possible to systematically analyze the performance of speech recognition systems across a wide variety of languages. We select 16 languages from the dataset and compare performance using a deep neural network-based acoustic model. The focus is on keyword spotting using the actual term-weighted value (ATWV) metric. We demonstrate that ATWV is keyword dependent, and that this must be accounted for in any cross-language analysis. Further, we show that while performance across languages does not track with any particular feature of the language, it is correlated with inter-annotator agreement.
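
Since the abstract does not spell out how ATWV is computed, the following is a minimal sketch based on the standard NIST Spoken Term Detection definition of term-weighted value, not the authors' scoring code. It assumes per-keyword counts taken at the system's actual decision threshold, one non-target trial per second of speech, and the usual weight of 999.9. The names `KeywordCounts` and `term_weighted_value` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

BETA = 999.9  # cost/value weight used in the NIST STD definition


@dataclass
class KeywordCounts:
    n_true: int         # reference occurrences of the keyword
    n_correct: int      # detections that match a reference occurrence
    n_false_alarm: int  # detections that match no reference occurrence


def term_weighted_value(counts, speech_seconds, beta=BETA):
    """Average TWV over the keywords that occur in the reference.

    `counts` maps keyword -> KeywordCounts, tallied at the system's
    chosen decision threshold (which is what makes the value "actual").
    """
    # Keywords with no reference occurrence have an undefined miss rate
    # and are left out of the average.
    scored = [c for c in counts.values() if c.n_true > 0]
    if not scored:
        raise ValueError("no keyword has reference occurrences")

    total_cost = 0.0
    for c in scored:
        p_miss = 1.0 - c.n_correct / c.n_true
        # Assumption: non-target trials ~ one per second of speech.
        p_false_alarm = c.n_false_alarm / (speech_seconds - c.n_true)
        total_cost += p_miss + beta * p_false_alarm
    return 1.0 - total_cost / len(scored)


# Illustrative, made-up counts: a frequent and a rare keyword in 10 h of speech.
example = {
    "frequent": KeywordCounts(n_true=10, n_correct=8, n_false_alarm=1),
    "rare": KeywordCounts(n_true=1, n_correct=0, n_false_alarm=0),
}
example_twv = term_weighted_value(example, speech_seconds=36000.0)
```

Because every keyword carries equal weight in the average, a single miss on a keyword that occurs once costs as much as missing every occurrence of a frequent one. This is one way to see why the score depends on the keyword list itself.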