Variational mode decomposition based acoustic and entropy features for speech emotion recognition

Cited by: 11
Authors
Mishra, Siba Prasad [1]
Warule, Pankaj [1]
Deb, Suman [1]
Affiliations
[1] Sardar Vallabhbhai Natl Inst Technol, Surat, Gujarat, India
Keywords
Deep neural network; Speech emotion recognition; MFCC; Permutation entropy; Approximate entropy; FEATURE-EXTRACTION; CLASSIFICATION; DEEP
DOI
10.1016/j.apacoust.2023.109578
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Automated speech emotion recognition (SER) is a machine-based method for identifying emotion from speech signals. SER has many practical applications, including improved man-machine interaction (MMI), online customer support, healthcare services, and online marketing. Because of this wide range of applications, SER has grown steadily in popularity among researchers over the past three decades. Numerous studies have employed various combinations of features and classifiers to improve emotion classification performance. In this study, we pursue the same goal using features based on variational mode decomposition (VMD). We extract MFCC, mel-spectrogram, approximate entropy (ApEn), and permutation entropy (PrEn) features from each VMD mode. Emotion classification performance is evaluated with a deep neural network (DNN) classifier, using the proposed VMD-based features both individually (MFCC, mel-spectrogram, ApEn, and PrEn) and in combination (MFCC + mel-spectrogram + ApEn + PrEn). On two datasets, EMO-DB and RAVDESS, we obtained classification accuracies of 91.59% and 80.83%, respectively. Compared with other methods, the proposed VMD-based feature combination with a DNN classifier performed better than state-of-the-art works in SER.
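The two entropy features named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no parameter settings, so the function names, the embedding order and delay, and the tolerance r = 0.2 x standard deviation are all assumptions taken from the textbook definitions of permutation entropy and approximate entropy. The VMD step itself is not shown; the functions would be applied to each mode after decomposition.

```python
import math
from collections import Counter

import numpy as np


def permutation_entropy(x, order=3, delay=1, normalize=True):
    """Shannon entropy of ordinal patterns (order and delay are assumed defaults)."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (order - 1) * delay
    if n <= 0:
        raise ValueError("signal too short for the chosen order and delay")
    # Each row is one embedded vector (x[i], x[i+delay], ..., x[i+(order-1)*delay]).
    span = (order - 1) * delay + 1
    embedded = np.array([x[i:i + span:delay] for i in range(n)])
    # The argsort of a row is its ordinal pattern; count how often each occurs.
    patterns = Counter(map(tuple, np.argsort(embedded, axis=1, kind="stable")))
    probs = np.array(list(patterns.values()), dtype=float) / n
    h = -np.sum(probs * np.log2(probs))
    # Normalizing by log2(order!) maps the result into [0, 1].
    return h / math.log2(math.factorial(order)) if normalize else h


def approximate_entropy(x, m=2, r=None):
    """ApEn = phi(m) - phi(m + 1); r defaults to 0.2 * std (an assumed choice)."""
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * np.std(x)

    def phi(k):
        n = len(x) - k + 1
        templates = np.array([x[i:i + k] for i in range(n)])
        # Chebyshev distance between every pair of templates, self-matches included.
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        counts = np.sum(dist <= r, axis=1) / n
        return np.mean(np.log(counts))

    return phi(m) - phi(m + 1)


# Hypothetical stand-ins for VMD modes: a regular tone and white noise.
rng = np.random.default_rng(0)
tone = np.sin(2 * np.pi * np.arange(300) / 50)
noise = rng.standard_normal(300)
features = {
    "tone": (permutation_entropy(tone), approximate_entropy(tone)),
    "noise": (permutation_entropy(noise), approximate_entropy(noise)),
}
```

Both measures should come out lower for the regular tone than for the noise, which is the property that makes them candidates for separating signal components of differing complexity. The pairwise distance matrix in `approximate_entropy` uses memory quadratic in the signal length, so long modes would need framing or a dedicated library.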
Pages: 12
Related papers
50 items in total
  • [31] RECOGNITION OF EMOTION IN SPEECH USING VARIOGRAM BASED FEATURES
    Esmaileyan, Zeynab
    Marvi, Hosein
    MALAYSIAN JOURNAL OF COMPUTER SCIENCE, 2014, 27 (03) : 156 - 170
  • [32] A Subset of Acoustic Features for Machine Learning-based and Statistical Approaches in Speech Emotion Recognition
    Costantini, Giovanni
    Cesarini, Valerio
    Casali, Daniele
    BIOSIGNALS: PROCEEDINGS OF THE 15TH INTERNATIONAL JOINT CONFERENCE ON BIOMEDICAL ENGINEERING SYSTEMS AND TECHNOLOGIES - VOL 4: BIOSIGNALS, 2022, : 257 - 264
  • [33] FUSION APPROACHES FOR EMOTION RECOGNITION FROM SPEECH USING ACOUSTIC AND TEXT-BASED FEATURES
    Pepino, Leonardo
    Riera, Pablo
    Ferrer, Luciana
    Gravano, Agustin
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6484 - 6488
  • [34] Speech Emotion Recognition Based on Self-Attention Weight Correction for Acoustic and Text Features
    Santoso, Jennifer
    Yamada, Takeshi
    Ishizuka, Kenkichi
    Hashimoto, Taiichi
    Makino, Shoji
    IEEE ACCESS, 2022, 10 : 115732 - 115743
  • [35] Jamming Recognition Algorithm Based on Variational Mode Decomposition
    Zhou, Hongping
    Wang, Ziwei
    Wu, Ruowu
    Xu, Xiong
    Guo, Zhongyi
    IEEE SENSORS JOURNAL, 2023, 23 (15) : 17341 - 17349
  • [36] Speech Emotion Recognition Based on Entropy of Enhanced Wavelet Coefficients
    Sultana, S.
    Shahnaz, C.
    Fattah, S. A.
    Ahmmed, I.
    Zhu, W. -P.
    Ahmad, M. O.
    2014 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2014, : 137 - 140
  • [37] Emotion classification from speech signal based on empirical mode decomposition and non-linear features: Speech emotion recognition
    Palani Thanaraj Krishnan
    Alex Noel Joseph Raj
    Vijayarajan Rajangam
    Complex & Intelligent Systems, 2021, 7 : 1919 - 1934
  • [38] Dysarthric Speech Recognition Using Variational Mode Decomposition and Convolutional Neural Networks
    R. Rajeswari
    T. Devi
    S. Shalini
    Wireless Personal Communications, 2022, 122 : 293 - 307
  • [39] Dysarthric Speech Recognition Using Variational Mode Decomposition and Convolutional Neural Networks
    Rajeswari, R.
    Devi, T.
    Shalini, S.
    WIRELESS PERSONAL COMMUNICATIONS, 2022, 122 (01) : 293 - 307
  • [40] Recognition of denatured biological tissue based on variational mode decomposition and multi-scale permutation entropy
    Liu Bei
    Hu Wei-Peng
    Zou Xiao
    Ding Ya-Jun
    Qian Sheng-You
    ACTA PHYSICA SINICA, 2019, 68 (02)