Multi-modal fusion network with complementarity and importance for emotion recognition

2023 
Multimodal emotion recognition, in which machine learning is used to extract multi-modal features from videos, has become a research hotspot in artificial intelligence. Traditional multi-modal emotion recognition methods simply concatenate the modalities, so the interactive information between them is poorly exploited and the true emotion cannot be captured well when modal features conflict. This article first proves that effective weighting can improve the discrimination between modalities. Accordingly, this paper accounts for the differences in importance among the modalities and assigns weights to them through an importance attention network. At the same time, since the modalities are partly complementary, this paper constructs a complementarity attention network across modalities. Finally, the reconstructed features are fused to obtain a multi-modal feature with good interaction. The proposed method is compared with traditional methods on public datasets, and the results show that it performs well on both accuracy and confusion-matrix metrics.
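A minimal sketch of the described pipeline is given below, assuming per-modality features have already been extracted. The class names (ImportanceAttention, ComplementarityAttention, FusionNet), the use of a linear scoring layer for importance weights, and the use of multi-head cross-modal attention for complementarity are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ImportanceAttention(nn.Module):
    """Hypothetical sketch: score each modality and rescale its features
    by a softmax-normalized importance weight."""

    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, feats):                            # feats: (batch, M, dim)
        scores = self.score(feats).squeeze(-1)           # (batch, M)
        weights = F.softmax(scores, dim=-1).unsqueeze(-1)
        return feats * weights                           # importance-weighted features


class ComplementarityAttention(nn.Module):
    """Hypothetical sketch: let each modality attend to the others so that
    complementary information is absorbed into the reconstructed features."""

    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, feats):                            # feats: (batch, M, dim)
        out, _ = self.attn(feats, feats, feats)
        return out


class FusionNet(nn.Module):
    """Fuse importance-weighted and complementarity-reconstructed features,
    then classify the emotion."""

    def __init__(self, dim, num_modalities, num_classes):
        super().__init__()
        self.importance = ImportanceAttention(dim)
        self.complement = ComplementarityAttention(dim)
        self.classifier = nn.Linear(dim * num_modalities, num_classes)

    def forward(self, feats):
        weighted = self.importance(feats)
        reconstructed = self.complement(weighted)
        fused = reconstructed.flatten(start_dim=1)       # concatenate modalities
        return self.classifier(fused)


# Example: 3 modalities (e.g. audio, visual, text), 128-d features, 7 emotion classes.
model = FusionNet(dim=128, num_modalities=3, num_classes=7)
logits = model(torch.randn(8, 3, 128))
print(logits.shape)  # torch.Size([8, 7])
```

The key design choice reflected here is that importance weighting and complementarity attention act on the same per-modality feature tensor before a final fusion step, matching the abstract's description of weighting, reconstruction, and fusion.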