Movie Emotion Estimation with Multimodal Fusion and Synthetic Data Generation

2019 
In this work, we propose a method for automatic emotion recognition from movie clips. This problem has applications in indexing and retrieval of large movie and video collections, summarization of visual content, and selection of emotion-invoking material. Our approach estimates valence and arousal values automatically: we extract audio and visual features, summarize them via functionals, PCA, and Fisher vector encoding, and select features using canonical correlation analysis. For classification, we use the extreme learning machine and the support vector machine. We evaluate our approach on the LIRIS-ACCEDE database with ground-truth annotations, addressing the class imbalance problem by generating synthetic data. By fusing the best features at the score and feature levels, we obtain good results on this problem, especially for valence prediction.
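
The following is a minimal, illustrative sketch of the kind of pipeline the abstract describes, not the authors' implementation. It assumes precomputed clip-level audio and visual feature matrices (generated randomly here as placeholders), uses SMOTE as a stand-in for the paper's synthetic-data generation, an SVM in place of the ELM/SVM back-end, and omits the CCA-based feature selection and Fisher vector encoding steps.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from imblearn.over_sampling import SMOTE  # synthetic oversampling, stand-in for the paper's method


def summarize(frame_features):
    """Collapse frame-level features into one clip-level vector with simple
    functionals (mean, std) -- one of the summarization strategies mentioned."""
    return np.concatenate([frame_features.mean(axis=0), frame_features.std(axis=0)])


# Hypothetical clip-level features for two modalities and imbalanced valence labels.
rng = np.random.default_rng(0)
X_audio = rng.normal(size=(200, 64))
X_visual = rng.normal(size=(200, 128))
y = (rng.random(200) > 0.85).astype(int)


def make_classifier():
    # PCA for dimensionality reduction followed by an SVM classifier.
    return make_pipeline(StandardScaler(), PCA(n_components=16), SVC(probability=True))


# Feature-level fusion: concatenate modalities, balance classes with synthetic
# samples, and train one classifier on the joint representation.
X_fused = np.hstack([X_audio, X_visual])
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_fused, y)
feature_fusion_clf = make_classifier().fit(X_bal, y_bal)

# Score-level fusion: train one classifier per modality and average their
# posterior probabilities at prediction time.
audio_clf = make_classifier().fit(*SMOTE(random_state=0).fit_resample(X_audio, y))
visual_clf = make_classifier().fit(*SMOTE(random_state=0).fit_resample(X_visual, y))
fused_scores = 0.5 * (audio_clf.predict_proba(X_audio) + visual_clf.predict_proba(X_visual))
predictions = fused_scores.argmax(axis=1)
```

In practice the per-modality classifiers would be trained on a training split and the fused scores evaluated on held-out clips; the sketch only illustrates how feature-level and score-level fusion differ in where the modalities are combined.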