Speaker Recognition from Distance Using X-Vectors with Reverberation-Robust Features

2019 
When a speaker is recorded from a distance, the microphone captures a speech signal corrupted by room reverberation and environmental noise. In this paper, we aim to reduce the detrimental effect of late reverberation and noise on speaker recognition based on an x-vector extractor. We investigate features that are robust against room reverberation, as well as dedicated Probabilistic Linear Discriminant Analysis (PLDA) training. Our experiments indicate that most of the improvement in the performance of the speaker embedding system comes from careful PLDA training with data adjusted to the test conditions. An improvement in reverberant conditions is also observed for the reverberation-robust features, such as Mel-frequency Cepstral Coefficients (MFCC) computed from the dereverberated signal and Mean Hilbert Envelope Coefficients (MHEC). This work was performed as part of our contribution to the Voices Obscured in Complex Environmental Settings (VOiCES) from a Distance Challenge 2019, in which the reported methods led to 12% and 32% relative gains in terms of the Equal Error Rate on the VOiCES development and evaluation datasets, respectively.
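For readers unfamiliar with the metric, the following is a minimal sketch of how an Equal Error Rate and a relative gain between two systems could be computed from verification scores. It is not the authors' evaluation code: the function names `equal_error_rate` and `relative_gain`, the simple threshold sweep, and the example values in the usage note are all assumptions made for illustration.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER with a sweep over all observed scores (illustrative helper)."""
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    # Every observed score is tried as a decision threshold.
    thresholds = np.sort(np.concatenate([target, nontarget]))
    # False rejection rate: target trials scored below the threshold.
    frr = np.array([(target < t).mean() for t in thresholds])
    # False acceptance rate: non-target trials scored at or above the threshold.
    far = np.array([(nontarget >= t).mean() for t in thresholds])
    # The EER is taken where the two error rates are closest to each other.
    idx = np.argmin(np.abs(frr - far))
    return (frr[idx] + far[idx]) / 2.0


def relative_gain(baseline_eer, improved_eer):
    """Relative EER reduction with respect to the baseline (illustrative helper)."""
    return (baseline_eer - improved_eer) / baseline_eer
```

As a usage example with made-up numbers (not taken from the paper), `relative_gain(0.10, 0.088)` evaluates to approximately 0.12, which is what a 12% relative gain means for a hypothetical baseline EER of 10%.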