Deep Representation Learning for Affective Speech Signal Analysis and Processing: Preventing unwanted signal disparities

Chi-Chun Lee, Kusha Sridhar, Jeng-Lin Li, Wei-Cheng Lin, Bo-Hao Su, Carlos Busso

2021

Speech emotion recognition (SER) is an important research area with direct impact on everyday applications spanning education, health care, security and defense, entertainment, and human–computer interaction. Advances in other speech signal modeling tasks, such as automatic speech recognition, text-to-speech synthesis, and speaker identification, have led to the current proliferation of speech-based technology. Incorporating SER into existing and future systems can take these voice-based solutions to the next level. Speech is a highly nonstationary signal with dynamically evolving spatial-temporal patterns, so developing algorithms that can handle real-life complexities often requires a sophisticated representation modeling framework.

Keywords:

Speaker identification
Representation
Signal processing
Speech recognition
Signal
Feature learning
Signal modeling
Emotion recognition
Computer science
Important research
