PSD and Signal Approximation-LSTM Based Speech Enhancement

Yi Li, Yang Sun, Syed Mohsen Naqvi

2019

Monaural speech enhancement is a challenging problem because the desired signal must be estimated from single-channel recordings. Many methods have been proposed; however, because they ignore the relevance of the specific frequency ranges of speech signals, the performance of current approaches is limited. In this paper, we divide the speech mixture into two subbands and extract the desired speech signal from each frequency band based on the power spectral density (PSD) of noise mixtures. The proposed method trains two long short-term memory (LSTM) recurrent neural networks (RNNs) in parallel on the subband short-time Fourier transform (STFT) of speech segments. The proposed LSTM RNN-based signal approximation (SA) method is evaluated on the IEEE and TIMIT datasets with various noise interferences from the NOISEX dataset. The evaluation results confirm that the proposed method outperforms state-of-the-art approaches.
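The subband split and the signal approximation objective mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the frame length, hop size, the crossover frequency `split_hz`, and the helper names `stft`, `split_subbands`, and `sa_loss` are all assumptions made here for illustration, and the loss shown is the commonly used SA form (squared error between the masked mixture magnitude and the clean magnitude), which may differ from the paper's exact formulation. The two LSTM networks themselves are omitted.

```python
import numpy as np


def stft(signal, frame_len=512, hop=256):
    """Hann-windowed STFT; returns shape (n_frames, frame_len // 2 + 1)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def split_subbands(spec, sample_rate, split_hz):
    """Split a spectrogram along the frequency axis at split_hz.

    split_hz is a hypothetical crossover; the abstract does not state
    where the two subbands are separated.
    """
    freqs = np.linspace(0.0, sample_rate / 2.0, spec.shape[1])
    k = int(np.searchsorted(freqs, split_hz))
    return spec[:, :k], spec[:, k:]


def sa_loss(mask, mixture_mag, clean_mag):
    """Common signal approximation loss: mean squared error between the
    masked mixture magnitude and the clean speech magnitude."""
    return float(np.mean((mask * mixture_mag - clean_mag) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample_rate = 16000
    mixture = rng.standard_normal(sample_rate)  # 1 s of stand-in audio
    spec = stft(mixture)
    low, high = split_subbands(spec, sample_rate, split_hz=2000.0)
    # In the setup the abstract describes, each subband would feed its own LSTM.
    print(low.shape, high.shape)
```

Each subband would then be handled by a separate network that predicts a mask for its own frequency range, with `sa_loss` computed per subband during training.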
