Speech emotion recognition by using complex MFCC and deep sequential model

Cited by: 17
Author
Patnaik, Suprava [1]
Affiliation
[1] Kalinga Inst Ind Technol, Sch Elect, Bhubaneswar, Odisha, India
Keywords
Speech emotion; MFCC; Emotion circumplex; 1-D CNN; NEURAL-NETWORKS; CLASSIFICATION; FEATURES;
DOI
10.1007/s11042-022-13725-y
Chinese Library Classification
TP [Automation and computer technology]
Subject Classification Code
0812
Abstract
Speech Emotion Recognition (SER) is one of the front-line research areas. For a machine, inferring emotion from speech is difficult because emotions are subjective and annotation is challenging. Nevertheless, researchers consider SER feasible because speech is quasi-stationary and emotions are declarative finite states. This paper addresses emotion classification using Complex Mel Frequency Cepstral Coefficients (c-MFCC) as the representative feature and a deep sequential model as the classifier. The experimental setup is speaker independent and accommodates marginal variations in the underlying phonemes. Testing has been carried out on the RAVDESS and TESS databases. Conceptually, the proposed model is attentive to prosodic cues. The main contributions of this work are twofold. First, it introduces the concept of c-MFCC and investigates it as a robust cue of emotion, thereby achieving a significant improvement in accuracy. Second, it establishes a correlation between MFCC-based accuracy and Russell's emotion circumplex pattern. According to Russell's 2D emotion circumplex model, emotional signals are combinations of several psychological dimensions even though they are perceived as discrete categories. The reported results are obtained from a deep sequential LSTM model. The proposed c-MFCC are found to be more robust to signal framing and more informative in terms of spectral roll-off, and are therefore put forward as the input to the classifier. For the RAVDESS database, the best accuracy achieved is 78.8% for fourteen classes, which improves to 91.6% for gender-integrated eight classes and 98.5% for affect-separated six classes. Although the RAVDESS dataset contains two analogous sentences, the reported results are for the complete dataset without any phonetic separation of the samples; thus, the proposed method appears to be largely insensitive to the underlying phonemes. Results are presented and discussed in the form of confusion matrices.
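To make the pipeline described in the abstract concrete, the sketch below shows a conventional MFCC front end feeding a deep sequential LSTM classifier, which is the general shape of the approach reported. It is a minimal illustration, not the paper's implementation: the c-MFCC feature is not reproduced, and librosa/Keras, the coefficient count N_MFCC, the fixed sequence length MAX_FRAMES, the layer sizes, and the eight-class target are all assumptions made for the example.

```python
# Illustrative sketch only (not the paper's code): a standard MFCC front end
# feeding a deep sequential LSTM classifier. The paper's complex MFCC (c-MFCC)
# feature is not reproduced here; plain librosa MFCCs are used as a stand-in.
import numpy as np
import librosa
from tensorflow.keras import layers, models

N_MFCC = 40        # assumed number of cepstral coefficients per frame
MAX_FRAMES = 300   # assumed fixed sequence length (pad or truncate)
N_CLASSES = 8      # e.g. the gender-integrated eight-class RAVDESS setting

def mfcc_sequence(wav_path, sr=22050):
    """Return a (MAX_FRAMES, N_MFCC) MFCC time series for one clip."""
    y, sr = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=N_MFCC).T   # (frames, N_MFCC)
    if mfcc.shape[0] < MAX_FRAMES:                              # zero-pad short clips
        mfcc = np.pad(mfcc, ((0, MAX_FRAMES - mfcc.shape[0]), (0, 0)))
    return mfcc[:MAX_FRAMES]

def build_model():
    """Deep sequential LSTM classifier over the framed feature sequence."""
    return models.Sequential([
        layers.Input(shape=(MAX_FRAMES, N_MFCC)),
        layers.LSTM(128, return_sequences=True),
        layers.LSTM(64),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.3),
        layers.Dense(N_CLASSES, activation="softmax"),
    ])

model = build_model()
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=50)
```

Swapping mfcc_sequence for the paper's c-MFCC extractor while keeping the sequential LSTM back end would mirror the structure of the reported experiments.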
Pages: 11897-11922
Number of pages: 26