$F_0$-Noise-Robust Glottal Source and Vocal Tract Analysis Based on ARX-LF Model

2021 
This paper proposes a robust automatic speech analysis method based on a source-filter model that combines an Auto-Regressive eXogenous (ARX) model with the Liljencrants-Fant (LF) model. The method estimates the glottal source waveform and the vocal tract shape parameters in two steps: first, the glottal source parameters are initialized using inverse filtering; second, the glottal source waveform and the vocal tract shape parameters are estimated simultaneously using an analysis-by-synthesis approach with an iterative algorithm. The method was evaluated on synthetic voices with glottal noise levels (signal-to-noise ratios) from 0 dB to 50 dB and fundamental frequencies ($F_0$) from 80 Hz to 320 Hz. The results show that the proposed method achieves much higher estimation accuracy than state-of-the-art inverse filtering methods across both the glottal noise levels and the $F_0$ values tested.
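As a rough illustration of the inner step of such an analysis-by-synthesis loop, the sketch below assumes the commonly used ARX form $s(n) = -\sum_{k=1}^{p} a_k s(n-k) + b_0 u(n) + e(n)$, which may differ in detail from the formulation used in the paper. For one fixed source candidate $u(n)$, the vocal tract coefficients are fitted by least squares and the residual energy scores the candidate. The function names `arx_synthesize` and `arx_fit`, the filter coefficients, and the half-sine pulse used as the source are all hypothetical; the pulse is only a stand-in and is not the LF model.

```python
import numpy as np

def arx_synthesize(a, b0, u, e=None):
    """Generate s[n] = -sum_k a[k-1]*s[n-k] + b0*u[n] + e[n] (assumed ARX form)."""
    p = len(a)
    e = np.zeros(len(u)) if e is None else e
    s = np.zeros(len(u))
    for n in range(len(u)):
        acc = b0 * u[n] + e[n]
        for k in range(1, min(p, n) + 1):
            acc -= a[k - 1] * s[n - k]
        s[n] = acc
    return s

def arx_fit(s, u, p):
    """Least-squares estimate of (a, b0) for one fixed source candidate u."""
    rows = np.arange(p, len(s))
    # Regressors: -s[n-1], ..., -s[n-p], followed by u[n].
    X = np.column_stack([-s[rows - k] for k in range(1, p + 1)] + [u[rows]])
    theta, *_ = np.linalg.lstsq(X, s[rows], rcond=None)
    residual = s[rows] - X @ theta
    # The mean squared residual is the score an outer loop would minimize
    # over candidate source parameters.
    return theta[:p], theta[p], float(np.mean(residual ** 2))

# Hypothetical test signal: a half-sine pulse train (NOT the LF model).
fs, f0 = 8000, 100
period = fs // f0
n = np.arange(800)
phase = (n % period) / period
u = np.where(phase < 0.6, np.sin(np.pi * phase / 0.6), 0.0)

a_true = np.array([-1.2, 0.8])   # illustrative stable two-pole filter
b0_true = 0.5
s = arx_synthesize(a_true, b0_true, u)

a_hat, b0_hat, mse = arx_fit(s, u, p=2)
```

In a full analysis-by-synthesis scheme, an outer loop would vary the source parameters, call a routine like `arx_fit` for each candidate, and keep the candidate with the smallest residual. The noiseless case above only checks that the fit recovers the coefficients used for synthesis.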