Confidence measures for CTC-based phone synchronous decoding

2017 
The Connectionist Temporal Classification (CTC) model has achieved state-of-the-art LVCSR performance. However, because CTC introduces a blank symbol, word-level confidence measures (CM) based on the CTC model cannot be easily calculated by directly applying the traditional phone posterior normalization or confusion network (CN) approaches. Recently, a phone synchronous decoding (PSD) framework has been proposed for efficient decoding with the CTC model. By automatically ignoring blank frames, PSD not only achieves a significant speed-up but also yields highly compact and precise CTC phone lattices. In this work, two CM generation approaches built on top of the PSD CTC lattice are proposed. A detailed investigation is also carried out to demonstrate the effectiveness of the PSD CTC lattice. Experiments on an English Switchboard LVCSR task show that the proposed PSD CTC lattice based CM significantly outperforms CM based on traditional frame synchronous decoding with CTC or HMM models.
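To illustrate what "ignoring blank frames" can look like in practice, the following minimal sketch drops frames whose blank posterior is very high before any search is run. It is an illustration under stated assumptions, not the authors' implementation: the function name `select_non_blank_frames`, the blank index 0, the threshold of 0.9, and the toy posteriors are all hypothetical.

```python
import numpy as np

def select_non_blank_frames(posteriors, blank_id=0, blank_threshold=0.9):
    """Return indices of frames whose blank posterior is below the threshold.

    posteriors: array of shape (num_frames, num_symbols), rows sum to 1.
    blank_id and blank_threshold are illustrative assumptions.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    keep = posteriors[:, blank_id] < blank_threshold
    return np.flatnonzero(keep)

# Toy CTC output: 5 frames, 3 symbols (index 0 is assumed to be blank).
toy = np.array([
    [0.98, 0.01, 0.01],    # blank-dominated frame
    [0.10, 0.85, 0.05],    # phone 1
    [0.95, 0.03, 0.02],    # blank-dominated frame
    [0.20, 0.10, 0.70],    # phone 2
    [0.99, 0.005, 0.005],  # blank-dominated frame
])

kept = select_non_blank_frames(toy)
print(kept.tolist())  # [1, 3]
```

In this toy case only two of the five frames remain for the decoder, which suggests why skipping blank-dominated frames can both speed up search and produce a more compact lattice.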