Dynamically locating multiple speakers based on the time-frequency domain

Hodaya Hammer, Shlomo E. Chazan, Jacob Goldberger, Sharon Gannot

2021

In this study we present a deep neural network-based online multi-speaker localization algorithm that uses a multi-microphone array. A fully convolutional network is trained with instantaneous spatial features to estimate the direction of arrival (DOA) for each time-frequency bin. The high-resolution classification enables the network to accurately and simultaneously localize and track multiple speakers, both static and dynamic. An extensive experimental study using simulated and real-life recordings in static and dynamic scenarios demonstrates that the proposed algorithm significantly outperforms both classic and recent deep-learning-based algorithms.
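
To illustrate how a per-bin classification could be turned into speaker directions, the sketch below takes a map of direction-of-arrival (DOA) class indices, one per time-frequency bin, and counts the votes per class in each frame. This is only a minimal illustration under stated assumptions, not the authors' procedure: the function names `frame_doa_histogram` and `pick_speakers`, the vote threshold, the number of DOA classes, and the toy data are all hypothetical, and the network that would produce the map is not shown.

```python
import numpy as np

def frame_doa_histogram(doa_map, num_classes):
    """Count, per time frame, how many frequency bins voted for each DOA class.

    doa_map: integer array of shape (frames, freqs); each entry is a DOA
             class index in [0, num_classes). (Hypothetical input format.)
    Returns an integer array of shape (frames, num_classes).
    """
    frames = doa_map.shape[0]
    hist = np.zeros((frames, num_classes), dtype=int)
    for t in range(frames):
        hist[t] = np.bincount(doa_map[t], minlength=num_classes)
    return hist

def pick_speakers(hist_row, max_speakers, min_votes):
    """Return up to max_speakers DOA classes with at least min_votes, strongest first."""
    order = np.argsort(hist_row)[::-1][:max_speakers]
    return [int(c) for c in order if hist_row[c] >= min_votes]

# Toy data (made up): 2 frames, 8 frequency bins, 13 DOA classes.
doa_map = np.array([
    [3, 3, 3, 9, 9, 3, 9, 3],
    [4, 4, 9, 9, 9, 4, 4, 4],
])
hist = frame_doa_histogram(doa_map, num_classes=13)

# One list of active DOA classes per frame; a moving speaker shows up
# as a class index that changes from frame to frame.
tracks = [pick_speakers(row, max_speakers=2, min_votes=2) for row in hist]
print(tracks)
```

In this toy case one source shifts from class 3 to class 4 between frames while the other stays at class 9, which is the kind of frame-by-frame behaviour an online tracker would need to follow.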

Keywords:

Direction of arrival
Artificial neural network
Algorithm
Time-frequency domain
Microphone array
High resolution
Computer science
Bin
