Improvements on transducing syllable lattice to word lattice for keyword search

2015 
This paper investigates a weighted finite-state transducer (WFST)-based syllable decoding and transduction method for keyword search (KWS), and compares it in detail with sub-word search and phone confusion methods. Context-dependent acoustic phone models are trained from word-level forced alignments and then used for syllable decoding and lattice generation. Pronunciations for out-of-vocabulary (OOV) keywords are produced by a grapheme-to-syllable (G2S) system and used to construct a lexical transducer. The lexical transducer is then composed with a keyword-boosted language model (LM) to transduce the syllable lattices into word lattices for the final KWS. Word error rates (WER) and KWS results are reported for five languages. The syllable transduction method is shown to give KWS results comparable to those of the syllable search and phone confusion methods. Combining the three methods further improves OOV KWS performance.
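The transduction step can be illustrated with a small sketch. It is not the paper's implementation: it replaces WFST composition over a full syllable lattice with a Viterbi search over a single syllable sequence, and it replaces the keyword-boosted LM with unigram costs from which a fixed amount is subtracted for keywords. The lexicon entries, probabilities, boost value, and the names `boosted_costs` and `transduce` are all hypothetical, chosen only for illustration.

```python
import math

# Hypothetical toy lexicon: word -> syllable pronunciation (illustrative only).
LEXICON = {
    "key": ("kiy",),
    "word": ("werd",),
    "keyword": ("kiy", "werd"),
    "search": ("serch",),
}

# Hypothetical unigram probabilities standing in for a language model.
UNIGRAM = {"key": 0.4, "word": 0.4, "keyword": 0.05, "search": 0.15}


def boosted_costs(unigram, keywords, boost):
    """Return negative-log costs, lowered by `boost` for each keyword."""
    costs = {}
    for word, prob in unigram.items():
        cost = -math.log(prob)
        if word in keywords:
            cost -= boost  # assumed boosting scheme: subtract a constant
        costs[word] = cost
    return costs


def transduce(syllables, lexicon, costs):
    """Find the lowest-cost word sequence whose pronunciations spell `syllables`.

    Returns (words, total_cost), or None if no segmentation exists.
    """
    n = len(syllables)
    best = [(math.inf, None)] * (n + 1)  # (cost, (start, word)) per position
    best[0] = (0.0, None)
    for end in range(1, n + 1):
        for word, pron in lexicon.items():
            start = end - len(pron)
            if start < 0 or best[start][0] == math.inf:
                continue
            if tuple(syllables[start:end]) != pron:
                continue
            candidate = best[start][0] + costs[word]
            if candidate < best[end][0]:
                best[end] = (candidate, (start, word))
    if best[n][0] == math.inf:
        return None
    # Follow back-pointers from the end to recover the word sequence.
    words, pos = [], n
    while pos > 0:
        start, word = best[pos][1]
        words.append(word)
        pos = start
    return words[::-1], best[n][0]


syllable_path = ["kiy", "werd", "serch"]
plain = boosted_costs(UNIGRAM, set(), 0.0)
boosted = boosted_costs(UNIGRAM, {"keyword"}, 2.0)
print(transduce(syllable_path, LEXICON, plain)[0])
print(transduce(syllable_path, LEXICON, boosted)[0])
```

With these toy numbers, the unboosted costs favor segmenting the syllables as "key word search", while boosting "keyword" makes the path containing it the cheapest, which is the effect a keyword-boosted LM is meant to have on the resulting word lattice.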