Two-Stage Refinement of Magnitude and Complex Spectra for Real-Time Speech Enhancement

2022 
In this letter, we propose a two-stage network for speech enhancement that predicts magnitude spectra in the first stage and complex spectra in the second stage. To maximize the model's performance at each stage, we introduce two convolutional modules: magnitude spectral masking (MSM) and complex spectra refinement (CSR). Each module is designed around the characteristics of the signal type it handles. The MSM estimates multiplicative masks that remove noise from the magnitude component of the convolutional features, and the CSR refines the complex component of the convolutional features using additive features. With these modules, our proposed two-stage enhancement model outperforms previously proposed state-of-the-art algorithms. In addition, our model has only 2.63 million parameters and can operate in real time thanks to its causal design and low computational complexity.
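
The contrast between multiplicative masking of magnitudes and additive refinement of complex values can be illustrated with a small NumPy sketch. This is not the proposed network: the abstract describes MSM and CSR as acting on convolutional features, whereas the sketch applies the same two kinds of operation directly to a toy spectrogram. The function names, the array shapes, the [0, 1] mask range, and the random stand-ins for network outputs are all assumptions made for illustration.

```python
import numpy as np


def apply_magnitude_mask(noisy_spec, mask):
    """Stage-1-style step: scale the magnitude by a mask, keep the noisy phase."""
    magnitude = np.abs(noisy_spec)
    phase = np.angle(noisy_spec)
    return (mask * magnitude) * np.exp(1j * phase)


def apply_complex_residual(stage1_spec, residual):
    """Stage-2-style step: add a complex correction to real and imaginary parts."""
    return stage1_spec + residual


# Toy data standing in for an STFT (frequency bins x frames) and for the
# outputs a trained model would produce. All values here are hypothetical.
rng = np.random.default_rng(0)
shape = (4, 5)
noisy = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
mask = rng.uniform(0.0, 1.0, size=shape)  # assumed mask range [0, 1]
residual = 0.1 * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

stage1 = apply_magnitude_mask(noisy, mask)
stage2 = apply_complex_residual(stage1, residual)
```

In this sketch the first step can only rescale each time-frequency bin and leaves the phase untouched, while the second step can change both magnitude and phase, which is one way to read the division of labor between the two stages.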