PSD and Signal Approximation-LSTM Based Speech Enhancement

2019 
Monaural speech enhancement is a challenging problem because the desired signal must be estimated from single-channel recordings. Many methods have been proposed; however, because they ignore the specific frequency range that speech signals occupy, the performance of current approaches is limited. In this paper, we divide the speech mixture into two subbands and extract the desired speech signal from each frequency band based on the power spectral density (PSD) of noise mixtures. The proposed method trains two long short-term memory (LSTM) recurrent neural networks (RNNs) in parallel on the subband short-time Fourier transform (STFT) of speech segments. The proposed LSTM RNN-based signal approximation (SA) method is evaluated on the IEEE and TIMIT datasets with various noise interferences from the NOISEX dataset. The evaluation results confirm that the proposed method outperforms state-of-the-art methods.
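The subband split and the signal approximation objective described in the abstract can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the frame length and hop size, the synthetic tone-plus-noise signal, and the fixed `split_bin` (the paper derives its split from the PSD, which is not reproduced here). The LSTM networks are also not implemented; a clipped oracle ratio mask stands in for a network output so that the loss can be evaluated.

```python
import numpy as np

def stft_magnitude(x, frame_len=256, hop=128):
    """Magnitude STFT with a Hann window; shape (frames, frame_len // 2 + 1)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))

def split_subbands(mag, split_bin):
    """Split a (frames, bins) spectrogram into a low and a high frequency band."""
    return mag[:, :split_bin], mag[:, split_bin:]

def signal_approximation_loss(mask, mixture_mag, clean_mag):
    """Mean squared error between the masked mixture and the clean magnitude."""
    return float(np.mean((mask * mixture_mag - clean_mag) ** 2))

# Synthetic stand-in data (assumption): a 440 Hz tone in white noise at 16 kHz.
rng = np.random.default_rng(0)
fs = 16000
t = np.arange(fs) / fs
clean = np.sin(2 * np.pi * 440 * t)
mixture = clean + 0.5 * rng.standard_normal(fs)

mix_mag = stft_magnitude(mixture)
clean_mag = stft_magnitude(clean)

# Hypothetical fixed split point; the paper chooses its bands from the PSD.
split_bin = 64
mix_low, mix_high = split_subbands(mix_mag, split_bin)
clean_low, clean_high = split_subbands(clean_mag, split_bin)

# Oracle ratio mask clipped to [0, 1], standing in for one network's output.
mask_low = np.clip(clean_low / (mix_low + 1e-8), 0.0, 1.0)

loss_low = signal_approximation_loss(mask_low, mix_low, clean_low)
loss_unmasked = signal_approximation_loss(np.ones_like(mix_low), mix_low, clean_low)
print(f"low-band SA loss: masked={loss_low:.4f}, unmasked={loss_unmasked:.4f}")
```

In a setup like the one the abstract outlines, each band would have its own network producing the mask, and a loss of this form would be computed per band on the masked mixture rather than on the mask itself.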