Variational mode decomposition based acoustic and entropy features for speech emotion recognition

Cited by: 11
Authors
Mishra, Siba Prasad [1]
Warule, Pankaj [1]
Deb, Suman [1]
Affiliations
[1] Sardar Vallabhbhai National Institute of Technology, Surat, Gujarat, India
Keywords
Deep neural network; Speech emotion recognition; MFCC; Permutation entropy; Approximate entropy; Feature extraction; Classification
DOI
10.1016/j.apacoust.2023.109578
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Automated speech emotion recognition (SER) is a machine-based method for identifying emotion from speech signals. SER has many practical applications, including improving man-machine interaction (MMI), online customer support, healthcare services, online marketing, etc. Because of the wide range of applications, the popularity of SER has been increasing among researchers for three decades. Numerous studies employed various combinations of features and classifiers to improve emotion classification performance. In our study, we tried to achieve the same by using variational mode decomposition (VMD)-based features. We extracted features like MFCC, mel-spectrogram, approximate entropy (ApEn), and permutation entropy (PrEn) from each VMD mode. The performance of emotion classification is evaluated using the deep neural network (DNN) classifier and the proposed VMD-based features individually (MFCC, mel-spectrogram, ApEn, and PrEn) and in combination (MFCC + mel-spectrogram + ApEn + PrEn). We used two datasets, RAVDESS and EMO-DB, to evaluate the emotion classification performance and obtained a classification accuracy of 91.59% and 80.83% for the EMO-DB and RAVDESS datasets, respectively. Our experimental results were compared with the other methods, and we found that the proposed VMD-based feature combinations with a DNN classifier performed better than the state-of-the-art works in SER.
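The two entropy features named in the abstract can be illustrated with a minimal sketch. It uses the textbook definitions of permutation entropy (Bandt and Pompe) and approximate entropy (Pincus) in plain Python. It is not the authors' implementation: the VMD step is omitted, the function names are hypothetical, and the parameter values (`m`, `tau`, `r`) are illustrative assumptions, since the abstract does not state the settings used.

```python
import math
from collections import Counter


def permutation_entropy(x, m=3, tau=1):
    """Normalized permutation entropy of a 1-D sequence (0 = regular, 1 = random).

    m (pattern length) and tau (delay) are assumed values, not the paper's.
    """
    n = len(x) - (m - 1) * tau
    if n <= 0:
        raise ValueError("sequence too short for the chosen m and tau")
    # Ordinal pattern of a window: the index order that sorts its samples.
    patterns = Counter(
        tuple(sorted(range(m), key=lambda k: x[i + k * tau]))
        for i in range(n)
    )
    h = -sum((c / n) * math.log(c / n) for c in patterns.values())
    # abs() only turns a possible -0.0 into 0.0; h is never negative.
    return abs(h) / math.log(math.factorial(m))


def approximate_entropy(x, m=2, r=0.2):
    """Approximate entropy ApEn(m, r) = phi(m) - phi(m + 1).

    r is an absolute tolerance here; a common convention (assumed, not taken
    from the paper) is to set it to 0.2 times the signal's standard deviation.
    """
    if len(x) < m + 1:
        raise ValueError("sequence too short for the chosen m")

    def phi(dim):
        n = len(x) - dim + 1
        templates = [x[i:i + dim] for i in range(n)]
        total = 0.0
        for a in templates:
            # Count templates within Chebyshev distance r (self-match included,
            # so the count is at least 1 and the logarithm is always defined).
            matches = sum(
                1 for b in templates
                if max(abs(p - q) for p, q in zip(a, b)) <= r
            )
            total += math.log(matches / n)
        return total / n

    return phi(m) - phi(m + 1)


# Stand-in for one decomposed mode: a synthetic sinusoid, not real speech.
mode = [math.sin(0.3 * k) for k in range(200)]
features = [permutation_entropy(mode), approximate_entropy(mode)]
```

In a pipeline like the one the abstract describes, values such as these would be computed for each mode and concatenated with the spectral features before classification; how the authors frame, normalize, or combine them is not specified in this record.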
Pages: 12