Dual-channel spectral weighting for robust speech recognition in mobile devices

Abstract Many mobile devices now include an extra microphone, frequently placed at their rear, intended to capture information about the environmental noise for speech de-noising purposes. Although this secondary sensor can be regarded as just another element in a microphone array when performing beamforming, in this paper we show that it can be treated differently in order to better exploit the information about the acoustic environment. In particular, we propose a novel spectral weighting based on Wiener filtering that takes advantage of this secondary microphone to perform noise-robust automatic speech recognition (ASR) in mobile devices. Initially, it is assumed that the secondary microphone captures only noise, while a reference sensor in the array (primary microphone) observes the same noise spectrum (homogeneous noise field). Since neither assumption always holds, the Wiener filter (WF) weighting is modified through 1) a bias correction term (to rectify the resulting spectral weights when a non-negligible speech component is present in the secondary channel) and 2) a novel noise equalization applied to the secondary channel before spectral weight computation. Speech recognition experiments are performed on a dual-microphone smartphone (AURORA2-2C-CT/FT corpora) and a tablet with six microphones (CHiME-3/4). Our results show the high performance of our approach as well as its versatility across the mobile devices and usage scenarios analyzed.
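To make the starting point concrete, the following is a minimal sketch of a Wiener-type spectral weight computed under the two initial assumptions only (the secondary channel contains noise alone, and the noise field is homogeneous). It is not the formulation proposed in the paper: the bias correction and noise equalization are deliberately left out, and the function name `wiener_weights`, its parameters, and the spectral floor value are hypothetical choices made for illustration.

```python
def wiener_weights(primary_psd, secondary_psd, floor=0.05):
    """Per-bin Wiener-type weights for one frame (illustrative sketch).

    primary_psd   -- noisy-speech power spectrum at the primary microphone
    secondary_psd -- power spectrum at the secondary microphone, used
                     directly as the noise estimate (assumption: noise-only
                     secondary channel and homogeneous noise field)
    floor         -- hypothetical lower bound on the weights
    """
    if len(primary_psd) != len(secondary_psd):
        raise ValueError("both spectra must have the same number of bins")

    weights = []
    for p, s in zip(primary_psd, secondary_psd):
        if p <= 0.0:
            # No energy in the primary channel: fall back to the floor.
            weights.append(floor)
            continue
        # Wiener gain (p - noise) / p with noise approximated by s.
        w = 1.0 - s / p
        # Keep the weight inside [floor, 1].
        weights.append(min(1.0, max(floor, w)))
    return weights


if __name__ == "__main__":
    print(wiener_weights([4.0, 1.0, 2.0], [1.0, 1.0, 3.0]))
```

When the secondary channel also picks up speech, or when the noise power differs between the two microphones, these weights become biased; that is the situation the bias correction term and the noise equalization described above are meant to address.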