A Path Signature Approach for Speech Emotion Recognition
2019
Automatic speech emotion recognition (SER) remains a difficult task within human-computer interaction, despite increasing interest in the research community. One key challenge is how to effectively integrate short-term characterisation of speech segments with long-term information such as temporal variations. Motivated by the numerical approximation theory of stochastic differential equations (SDEs), we propose the novel use of path signatures, which provide a pathwise definition of solutions to SDEs, for the integration of short speech frames. Furthermore, we propose a hierarchical tree structure of path signatures to capture both global and local information. A simple tree-based convolutional neural network (TBCNN) is used for learning the structural information stemming from dyadic path-tree signatures. Our experimental results on a widely used benchmark dataset demonstrate comparable performance to complex neural-network-based systems.
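To make the two central objects concrete, the following is a minimal sketch, not the authors' implementation. It assumes a sequence of frame-level feature vectors is treated as a piecewise-linear path, truncates the signature at depth 2, and builds a dyadic tree by repeatedly halving the sequence. The function names `signature_depth2` and `dyadic_signatures`, the truncation depth, and the toy path are illustrative assumptions; the abstract does not specify the features, truncation level, or tree depth used in the paper.

```python
import numpy as np


def signature_depth2(path):
    """Depth-2 signature of a piecewise-linear path.

    `path` is an (n_points, d) array. Returns (level1, level2), where
    level1 has shape (d,) and level2 has shape (d, d).
    """
    path = np.asarray(path, dtype=float)
    d = path.shape[1]
    level1 = np.zeros(d)
    level2 = np.zeros((d, d))
    for delta in np.diff(path, axis=0):
        # Chen's identity for appending one linear segment with increment
        # `delta`, whose own depth-2 term is 0.5 * outer(delta, delta).
        level2 += np.outer(level1, delta) + 0.5 * np.outer(delta, delta)
        level1 += delta
    return level1, level2


def dyadic_signatures(path, n_levels):
    """Signatures on a dyadic tree of sub-paths (hypothetical helper).

    Level k splits the path into 2**k consecutive pieces; level 0 is the
    whole path. Returns a list of levels, each a list of signatures.
    """
    path = np.asarray(path, dtype=float)
    n_segments = len(path) - 1
    tree = []
    for level in range(n_levels):
        n_nodes = 2 ** level
        # Piece boundaries expressed as indices into the point sequence.
        bounds = np.linspace(0, n_segments, n_nodes + 1).round().astype(int)
        tree.append([
            signature_depth2(path[a:b + 1])
            for a, b in zip(bounds[:-1], bounds[1:])
        ])
    return tree


# Toy 2-D path standing in for a sequence of frame-level features.
toy_path = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
root_level1, root_level2 = signature_depth2(toy_path)
toy_tree = dyadic_signatures(toy_path, n_levels=2)
```

In this sketch the root node summarises the whole utterance (global information) while deeper levels summarise progressively shorter spans (local information). Chen's identity guarantees that a parent signature can be recovered from its two children, which is what makes the tree structure consistent.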