Performance of deer hunting optimization based deep learning algorithm for speech emotion recognition

2020 
This paper proposes a speech emotion recognition technique based on Optimized Deep Neural Network. The speech signals are denoised by presenting a novel adaptive wavelet transform with a modified galactic swarm optimization algorithm (AWT_MGSO). From the noise removed speech signals, the spectral features like LPC (Linear Prediction Coefficients), MFCC (Mel frequency cepstral coefficients), PSD (power spectral density) and prosodic features like energy, entropy, formant frequencies and pitch are extracted and certain features are selected by ASFO (Adaptive Sunflower Optimization Algorithm). The optimized DNN-DHO (Deep Neural Network with Deer Hunting Optimization Algorithm) is proposed for emotion classification. An enhanced squirrel search algorithm is proposed to update the weight in the optimized DNN_DHO classifier. In this study, all the eight emotions of the speech from RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song) and TESS (Toronto Emotional Speech Set) databases for English and IITKGP-SEHSC (Indian Institute of Technology Kharagpur Simulated Emotion Hindi Speech Corpus) database for Hindi are classified. The experimental results are obtained and compared with the classifiers such as DNN_DHO, DNN (Deep Neural Network) and DAE (Deep Auto Encoder). The experimental results show that the proposed algorithm obtains maximum accuracy as 97.85% by the TESS dataset, 97.14% by the RAVDESS dataset and 93.75% by the IITKGP-SEHSC dataset by the DNN-HHO classifier.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    38
    References
    1
    Citations
    NaN
    KQI
    []