Combining Multi-perspective Attention Mechanism with Convolutional Networks for Monaural Speech Enhancement

2020 
The redundant convolutional encoder-decoder network has proven useful in speech enhancement tasks. Through its fully convolutional structure and the feature-selection capability afforded by the encoder-decoder mechanism, this network can capture the localized time-frequency details of speech signals. However, it does not explicitly extract informative features, which we regard as important for the representational capability of speech enhancement models. To address this problem, we introduce attention mechanisms into the convolutional encoder-decoder model to explicitly emphasize useful information from three perspectives: channel-wise, spatial, and concurrent spatial-and-channel-wise. The attention operation is realized through the squeeze-and-excitation mechanism and its variants. By assigning weights from these different perspectives according to global information, the model can adaptively emphasize valuable information and suppress useless information, thereby improving its representational capability. Experimental results show that the proposed attention mechanisms require only a small fraction of additional parameters to effectively improve the performance of convolutional neural network (CNN)-based models over their plain counterparts, and that they generalize well to unseen noises, signal-to-noise ratios (SNRs), and speakers. Among these mechanisms, the concurrent spatial-and-channel-wise attention yields the most significant improvement. Compared with the state of the art, the proposed models produce comparable or better results. We also integrate the proposed attention mechanisms into other CNN-based models and observe performance gains. Moreover, we visualize the enhancement results to illustrate the effect of the attention mechanisms more clearly.
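The squeeze-and-excitation operation the abstract refers to can be illustrated with a minimal sketch. The sketch below is an assumption about the general SE recipe (squeeze by global average pooling, excite through a bottleneck MLP, then rescale each channel), not the paper's exact implementation; the function name `channel_attention` and the explicit weight arguments are hypothetical, and real models would learn the weights and operate on framework tensors rather than nested lists.

```python
import math


def channel_attention(x, w1, b1, w2, b2):
    """Sketch of squeeze-and-excitation channel attention.

    x is a feature tensor of shape [C][T][F] (channels, time, frequency),
    given as nested lists; w1/b1 and w2/b2 are the bottleneck MLP weights
    (C -> C/r -> C). Returns the channel-reweighted tensor.
    """
    C = len(x)
    # Squeeze: global average pooling per channel -> descriptor z of length C,
    # so each gate is computed from global (whole time-frequency map) information.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation: bottleneck layer with ReLU ...
    h = [max(0.0, sum(w1[j][i] * z[i] for i in range(C)) + b1[j])
         for j in range(len(w1))]
    # ... then expansion with a sigmoid, giving one gate per channel in (0, 1).
    s = [1.0 / (1.0 + math.exp(-(sum(w2[k][j] * h[j] for j in range(len(h)))
                                 + b2[k])))
         for k in range(C)]
    # Rescale: emphasize or suppress each channel's time-frequency map.
    return [[[s[k] * v for v in row] for row in x[k]] for k in range(C)]
```

The spatial variant follows the same pattern with the roles swapped (pool across channels, gate each time-frequency position), and the concurrent variant applies both gates together.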