Dual-Channel VTS Feature Compensation with Improved Posterior Estimation
The use of dual microphones is a powerful tool for noise-robust automatic speech recognition (ASR). In particular, it allows classical techniques such as vector Taylor series (VTS) feature compensation to be reformulated. In this work, we consider a critical issue in VTS compensation, namely posterior computation, and propose an alternative way to estimate these probabilities more accurately when VTS is applied to enhance noisy speech captured by dual-microphone mobile devices. Our proposal models the conditional dependence of the noisy secondary channel given the primary one, and it outperforms not only single-channel VTS feature compensation but also a previous dual-channel VTS approach based on a stacked formulation. This is confirmed by recognition experiments on two different dual-channel extensions of the Aurora-2 corpus. These extensions emulate the use of a dual-microphone smartphone in close- and far-talk conditions, and our proposal achieves notable improvements in the latter case.
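As a rough sketch of the idea, and not necessarily the exact formulation used in this work, the posterior in question can be written with the chain rule. The notation is assumed here for illustration only: y_1 and y_2 denote the noisy feature vectors from the primary and secondary channels, k indexes one of K components of a clean-speech Gaussian mixture model, and P(k) is the prior weight of that component.

```latex
P(k \mid \mathbf{y}_1, \mathbf{y}_2)
  = \frac{P(k)\, p(\mathbf{y}_1 \mid k)\, p(\mathbf{y}_2 \mid \mathbf{y}_1, k)}
         {\sum_{k'=1}^{K} P(k')\, p(\mathbf{y}_1 \mid k')\, p(\mathbf{y}_2 \mid \mathbf{y}_1, k')}
```

Under this reading, the term p(y_2 | y_1, k) is where the conditional dependence of the secondary channel on the primary one enters. A stacked formulation would instead evaluate a single joint likelihood of the concatenated vector formed from y_1 and y_2 for each component.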