Multimodal and Multiresolution Depression Detection from Speech and Facial Landmark Features

2016 
Automatic classification of depression using audiovisual cues can help towards its objective diagnosis. In this paper, we present a multimodal depression classification system as a part of the 2016 Audio/Visual Emotion Challenge and Workshop (AVEC2016). We investigate a number of audio and video features for classification with different fusion techniques and temporal contexts. In the audio modality, Teager energy cepstral coefficients~(TECC) outperform standard baseline features; while the best accuracy is achieved with i-vector modelling based on MFCC features. On the other hand, polynomial parameterization of facial landmark features achieves the best performance among all systems and outperforms the best baseline system as well.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    49
    References
    57
    Citations
    NaN
    KQI
    []