Performance of deer hunting optimization based deep learning algorithm for speech emotion recognition

被引:29
|
作者
Agarwal, Gaurav [1 ]
Om, Hari [1 ]
机构
[1] IIT ISM, Dept Comp Sci & Engn, Dhanbad 826004, Jharkhand, India
关键词
Speech emotion recognition; Adaptive wavelet transform; Modified galactic swarm optimization; Adaptive sunflower optimization algorithm; Optimized deep neural network; Deer hunting optimization algorithm; IDENTIFICATION; SYSTEM; VOICE;
D O I
10.1007/s11042-020-10118-x
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
This paper proposes a speech emotion recognition technique based on Optimized Deep Neural Network. The speech signals are denoised by presenting a novel adaptive wavelet transform with a modified galactic swarm optimization algorithm (AWT_MGSO). From the noise removed speech signals, the spectral features like LPC (Linear Prediction Coefficients), MFCC (Mel frequency cepstral coefficients), PSD (power spectral density) and prosodic features like energy, entropy, formant frequencies and pitch are extracted and certain features are selected by ASFO (Adaptive Sunflower Optimization Algorithm). The optimized DNN-DHO (Deep Neural Network with Deer Hunting Optimization Algorithm) is proposed for emotion classification. An enhanced squirrel search algorithm is proposed to update the weight in the optimized DNN_DHO classifier. In this study, all the eight emotions of the speech from RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song) and TESS (Toronto Emotional Speech Set) databases for English and IITKGP-SEHSC (Indian Institute of Technology Kharagpur Simulated Emotion Hindi Speech Corpus) database for Hindi are classified. The experimental results are obtained and compared with the classifiers such as DNN_DHO, DNN (Deep Neural Network) and DAE (Deep Auto Encoder). The experimental results show that the proposed algorithm obtains maximum accuracy as 97.85% by the TESS dataset, 97.14% by the RAVDESS dataset and 93.75% by the IITKGP-SEHSC dataset by the DNN-HHO classifier.
引用
收藏
页码:9961 / 9992
页数:32
相关论文
共 50 条
  • [31] The Efficacy of Deep Learning-Based Mixed Model for Speech Emotion Recognition
    Uddin, Mohammad Amaz
    Chowdury, Mohammad Salah Uddin
    Khandaker, Mayeen Uddin
    Tamam, Nissren
    Sulieman, Abdelmoneim
    CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 74 (01): : 1709 - 1722
  • [32] Language dialect based speech emotion recognition through deep learning techniques
    Rajendran, Sukumar
    Mathivanan, Sandeep Kumar
    Jayagopal, Prabhu
    Venkatasen, Maheshwari
    Pandi, Thanapal
    Sorakaya Somanathan, Manivannan
    Thangaval, Muthamilselvan
    Mani, Prasanna
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2021, 24 (03) : 625 - 635
  • [33] Speech Emotion Recognition Based on Learning Automata in
    Motamed, Sara
    Setayeshi, Saeed
    Farhoudi, Zeinab
    Ahmadi, Ali
    JOURNAL OF MATHEMATICS AND COMPUTER SCIENCE-JMCS, 2014, 12 (03): : 173 - 185
  • [34] Emotion Regulation and Performance Enhancement in College Athletes Based on Emotion Recognition and Deep Learning
    Li, Haiqing
    REVISTA INTERNACIONAL DE MEDICINA Y CIENCIAS DE LA ACTIVIDAD FISICA Y DEL DEPORTE, 2022, 22 (86): : 476 - 492
  • [35] Speech Emotion Recognition Using Deep Learning Techniques: A Review
    Khalil, Ruhul Amin
    Jones, Edward
    Babar, Mohammad Inayatullah
    Jan, Tariqullah
    Zafar, Mohammad Haseeb
    Alhussain, Thamer
    IEEE ACCESS, 2019, 7 : 117327 - 117345
  • [36] Data Augmentation Techniques for Speech Emotion Recognition and Deep Learning
    Antonio Nicolas, Jose
    de Lope, Javier
    Grana, Manuel
    BIO-INSPIRED SYSTEMS AND APPLICATIONS: FROM ROBOTICS TO AMBIENT INTELLIGENCE, PT II, 2022, 13259 : 279 - 288
  • [37] Emotion recognition from speech using deep learning on spectrograms
    Li, Xingguang
    Song, Wenjun
    Liang, Zonglin
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2020, 39 (03) : 2791 - 2796
  • [38] Speech Emotion Recognition Using Deep Learning on audio recordings
    Suganya, S.
    Charles, E. Y. A.
    2019 19TH INTERNATIONAL CONFERENCE ON ADVANCES IN ICT FOR EMERGING REGIONS (ICTER - 2019), 2019,
  • [39] Transfer Learning of Deep Neural Network for Speech Emotion Recognition
    Huang, Ying
    Hu, Mingqing
    Yu, Xianguo
    Wang, Tao
    Yang, Chen
    PATTERN RECOGNITION (CCPR 2016), PT II, 2016, 663 : 721 - 729
  • [40] A deep interpretable representation learning method for speech emotion recognition
    Jing, Erkang
    Liu, Yezheng
    Chai, Yidong
    Sun, Jianshan
    Samtani, Sagar
    Jiang, Yuanchun
    Qian, Yang
    INFORMATION PROCESSING & MANAGEMENT, 2023, 60 (06)