Short segment automatic language identification using a multifeature-transition matrix approach

2003 
This paper presents a new technique for automatic language identification (ALID). The primary goal is a technique that requires a minimal amount of training data, operates on very short segments of speech, and allows new languages to be added easily. A secondary goal is an algorithm with low computational cost. The proposed approach is based on multi-feature (MF), multi-classifier (MC) transition matrices. It models not only the static acoustic components of a language but also the dynamics of sub-sound to sub-sound transitions within that language. The transition matrix concept is not only competitive in performance with other techniques in the literature but also particularly suited to the short-segment problem. Closed-set experiments on the 3-second segments of the 1996 NIST Language Identification Evaluation database show the performance of the MF/MC transition matrix technique to be promising.
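The general idea of scoring a short segment against per-language transition matrices can be illustrated with a minimal sketch. This is not the paper's MF/MC method: it uses a single symbol stream and a single classifier, and it assumes that speech frames have already been mapped to integer "sub-sound" symbols by some quantizer that is not shown. The function names, the add-one smoothing, and the toy training data are all assumptions made for illustration.

```python
import math

def train_transition_matrix(sequences, n_symbols, alpha=1.0):
    """Estimate a row-stochastic symbol-to-symbol transition matrix.

    alpha is an additive smoothing constant (an assumption, not from the
    paper) so that unseen transitions keep a non-zero probability.
    """
    counts = [[alpha] * n_symbols for _ in range(n_symbols)]
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1.0
    # Normalise each row so it sums to 1.
    return [[c / sum(row) for c in row] for row in counts]

def log_likelihood(seq, matrix):
    """Sum of log transition probabilities along the symbol sequence."""
    return sum(math.log(matrix[p][c]) for p, c in zip(seq, seq[1:]))

def identify(seq, models):
    """Return the language whose matrix gives the highest log-likelihood."""
    return max(models, key=lambda lang: log_likelihood(seq, models[lang]))

# Toy symbol sequences (hypothetical data, 3 symbols: 0, 1, 2).
training = {
    "lang_a": [[0, 1, 0, 1, 0, 1, 2]],
    "lang_b": [[2, 2, 1, 2, 2, 0, 2]],
}
models = {lang: train_transition_matrix(seqs, 3)
          for lang, seqs in training.items()}

print(identify([0, 1, 0, 1], models))
print(identify([2, 2, 1, 2], models))
```

In a sketch of this kind, adding a new language only requires estimating one more matrix from that language's data, and scoring a test segment costs one table lookup per transition. Both properties are consistent with the goals stated in the abstract, although the paper's actual training and scoring procedure may differ.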