Role of voice activity detection methods for the speakers in the wild challenge

2017 
A major cause of performance degradation in speaker verification (SV) systems under real-world conditions is their inability to locate speech regions in the presence of noise. This work examines the role of voice activity detection (VAD) methods in alleviating this shortcoming. The experiments are conducted on the core-core task of the speakers in the wild (SITW) challenge. Two VAD approaches are explored: the recently proposed self-adaptive VAD, and a method based on vowel-like region (VLR) detection. To evaluate the effectiveness of these approaches, the SV systems are developed using the i-vector framework in the front-end and probabilistic linear discriminant analysis (PLDA) in the back-end. The self-adaptive VAD based system performs better than the VLR based system in high-SNR conditions, whereas under degraded conditions the VLR based method is relatively more robust. Exploiting this complementary behavior, fusing the scores of the two systems yields significant improvements in SV performance.
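The score fusion mentioned above can be illustrated with a minimal sketch. The abstract does not state which fusion rule is used, so the version below assumes a simple weighted sum of z-normalized trial scores; the function names, the `weight_a` parameter, and the example scores are all hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def z_normalize(scores):
    """Shift and scale a list of scores to zero mean and unit variance."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All scores identical: no spread to normalize, return zeros.
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def fuse_scores(scores_a, scores_b, weight_a=0.5):
    """Weighted sum of two systems' normalized scores (assumed fusion rule)."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must score the same trials")
    norm_a = z_normalize(scores_a)
    norm_b = z_normalize(scores_b)
    return [weight_a * a + (1.0 - weight_a) * b
            for a, b in zip(norm_a, norm_b)]


# Hypothetical PLDA scores for the same four trials from each system.
self_adaptive_scores = [2.0, -1.0, 0.5, -2.5]
vlr_scores = [1.0, -0.5, 1.5, -2.0]

fused = fuse_scores(self_adaptive_scores, vlr_scores)
```

Normalizing before summing keeps one system from dominating only because its scores lie on a larger scale. In practice the weight would be tuned on a development set rather than fixed at 0.5.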