Reverberation-Robust Localization of Speakers Using Distinct Speech Onsets and Multichannel Cross Correlations

2018 
Many speaker localization methods can be found in the literature. However, speaker localization under strong reverberation remains a major challenge in real-world applications. This paper proposes two algorithms for localizing speakers using microphone array recordings of reverberated sounds. To separate concurrent speakers, the first algorithm decomposes microphone signals spectrotemporally into subbands via an auditory filterbank. To suppress reverberation, we propose a novel speech onset detection approach derived from the speech signal and impulse response models, and further propose to formulate the multichannel cross-correlation coefficient of encoded speech onsets in each subband. The subband results are combined to estimate the directions-of-arrival of speakers. The second algorithm extends the generalized cross-correlation phase transform method by using redundant information of multiple microphones to address the reverberation problem. The proposed methods have been evaluated under adverse conditions using not only simulated signals (reverberation time $T_{60}$ of up to $1$ s) but also recordings in a real reverberant room ($T_{60} \approx 0.65$ s). Experimental comparisons with several state-of-the-art localization methods confirm that the proposed methods can reliably locate static and moving speakers in the presence of reverberation.
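For orientation, the standard two-microphone generalized cross-correlation phase transform (GCC-PHAT) that the second algorithm takes as its starting point can be sketched as follows. This is a minimal illustration of the textbook baseline only, not the multichannel extension proposed in the paper. The function name `gcc_phat`, its parameters, and the small regularization constant are assumptions made for this example, which relies on NumPy.

```python
import numpy as np


def gcc_phat(x1, x2, fs, max_tau=None):
    """Estimate the delay of x2 relative to x1, in seconds, with GCC-PHAT.

    Hypothetical helper for illustration; a positive result means that
    x2 lags behind x1.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(x1) + len(x2)
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)

    # Cross-power spectrum with PHAT weighting: discard the magnitude and
    # keep only the phase. The constant 1e-12 avoids division by zero.
    cross = X2 * np.conj(X1)
    cross /= np.abs(cross) + 1e-12
    cc = np.fft.irfft(cross, n=n)

    # Restrict the search to physically plausible lags if max_tau is given.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that the lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(np.abs(cc))) - max_shift
    return lag / fs
```

Under a far-field assumption, a delay estimate $\tau$ for a microphone pair with spacing $d$ maps to a direction-of-arrival via $\theta = \arcsin(c\,\tau / d)$, where $c$ is the speed of sound. In strong reverberation the single-pair correlation peak becomes unreliable, which is the problem the paper addresses by exploiting the redundancy of multiple microphones.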