Novel acoustic features for speech emotion recognition

Cited by: 9
Authors
Roh Yong-Wan [1]
Kim Dong-Ju [1]
Lee Woo-Seok [2]
Hong Kwang-Seok [1]
Affiliations
[1] Sungkyunkwan Univ, Sch Informat & Commun Engn, Suwon 440746, Kyungki Do, South Korea
[2] Anyang Trade Ctr, Dev Div, DSP Dev Team, Anyang 431050, South Korea
Keywords
speech emotion recognition; MFB spectral entropy; entropy; emotion recognition; rejection
DOI
10.1007/s11431-009-0204-3
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract
This paper focuses on acoustic features that effectively improve the recognition of emotion in human speech. The novel features are spectral entropy parameters: fast Fourier transform (FFT) spectral entropy, delta FFT spectral entropy, Mel-frequency filter bank (MFB) spectral entropy, and delta MFB spectral entropy. Spectral entropy features are simple to compute. They reflect the frequency characteristics of speech and the changes in those characteristics. We also implement an emotion rejection module that uses the probability distributions of recognized-scores and rejected-scores, which reduces the false recognition rate and so improves overall performance. Recognized-scores and rejected-scores are the probabilities of recognized and rejected emotion recognition results, respectively. These scores are first obtained from a pattern recognition procedure that uses the Gaussian mixture model (GMM). We classify speech into four emotional states: anger, sadness, happiness, and neutrality. The proposed method is evaluated using 45 sentences per emotion from 30 subjects, 15 male and 15 female. Experimental results show that the proposed method outperforms existing GMM-based emotion recognition methods that use energy, zero crossing rate (ZCR), linear prediction coefficient (LPC), and pitch parameters. One of the proposed features, the combination of MFB and delta MFB spectral entropy, improves performance by approximately 10% compared with the existing feature parameters for speech emotion recognition. Applying emotion rejection to results with low confidence scores yields a 4% performance improvement.
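To make the idea of FFT spectral entropy and its delta concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the framing, windowing, or the exact entropy and delta definitions, so the 400-sample frame, 160-sample hop, Hamming window, 512-point FFT, base-2 Shannon entropy, and first-difference delta below are all assumptions, and the function names are hypothetical.

```python
import numpy as np


def spectral_entropy(frame, n_fft=512, eps=1e-12):
    """Shannon entropy (bits) of one frame's normalized power spectrum.

    Assumption: the power spectrum is normalized to sum to 1 and treated
    as a probability distribution over frequency bins.
    """
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed, n=n_fft)) ** 2
    p = power / (power.sum() + eps)
    return float(-np.sum(p * np.log2(p + eps)))


def frame_signal(signal, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames (assumed framing)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    return np.stack(
        [signal[i * hop : i * hop + frame_len] for i in range(n_frames)]
    )


def entropy_features(signal, frame_len=400, hop=160):
    """Return an (n_frames, 2) array: spectral entropy and its delta.

    Assumption: delta is the simple frame-to-frame difference, with the
    first frame's delta set to zero.
    """
    frames = frame_signal(signal, frame_len, hop)
    ent = np.array([spectral_entropy(f) for f in frames])
    delta = np.diff(ent, prepend=ent[0])
    return np.column_stack([ent, delta])
```

With these assumptions, a pure tone concentrates its power in a few bins and gets a low entropy, while white noise spreads power across all bins and gets an entropy close to the maximum of log2(257) bits for a 512-point FFT. An MFB variant would apply the same entropy formula to Mel filter bank energies instead of raw FFT bins.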
Pages: 1838-1848
Page count: 11
Related papers
50 in total
  • [31] Exploiting the potentialities of features for speech emotion recognition
    Li, Dongdong
    Zhou, Yijun
    Wang, Zhe
    Gao, Daqi
    [J]. INFORMATION SCIENCES, 2021, 548 : 328 - 343
  • [32] Adding dimensional features for emotion recognition on speech
    Ben Letaifa, Leila
    Ines Torres, Maria
    Justo, Raquel
    [J]. 2020 5TH INTERNATIONAL CONFERENCE ON ADVANCED TECHNOLOGIES FOR SIGNAL AND IMAGE PROCESSING (ATSIP'2020), 2020,
  • [33] Significance of Phonological Features in Speech Emotion Recognition
    Wang, Wei
    Watters, Paul A.
    Cao, Xinyi
    Shen, Lingjie
    Li, Bo
    [J]. INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2020, 23 (03) : 633 - 642
  • [34] Speech emotion recognition: Features and classification models
    Chen, Lijiang
    Mao, Xia
    Xue, Yuli
    Cheng, Lee Lung
    [J]. DIGITAL SIGNAL PROCESSING, 2012, 22 (06) : 1154 - 1160
  • [35] Statistical Evaluation of Speech Features for Emotion Recognition
    Iliou, Theodoros
    Anagnostopoulos, Christos-Nikolaos
    [J]. ICDT: 2009 FOURTH INTERNATIONAL CONFERENCE ON DIGITAL TELECOMMUNICATIONS, 2009, : 121 - 126
  • [36] Speech Emotion Recognition Based on Arabic Features
    Meddeb, Mohamed
    Karray, Hichem
    Alimi, Adel M.
    [J]. 2015 15TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS (ISDA), 2015, : 46 - 51
  • [37] Hybrid Spectral Features for Speech Emotion Recognition
    Shah, Firoz A.
    Anto, Babu P.
    [J]. 2017 INTERNATIONAL CONFERENCE ON INNOVATIONS IN INFORMATION, EMBEDDED AND COMMUNICATION SYSTEMS (ICIIECS), 2017,
  • [38] Voice Quality Features for Speech Emotion Recognition
    Idris, Inshirah
    Salam, Md Sah Hj
    [J]. JOURNAL OF INFORMATION ASSURANCE AND SECURITY, 2015, 10 (04) : 183 - 191
  • [39] Speech Databases, Speech Features, and Classifiers in Speech Emotion Recognition: A Review
    Mohmad Dar, G.H.
    Delhibabu, Radhakrishnan
    [J]. IEEE Access, 2024, 12 : 151122 - 151152
  • [40] Speech Emotion Recognition Using Novel HHT-TEO Based Features
    Xiang, Li
    Xin, Li
    [J]. JOURNAL OF COMPUTERS, 2011, 6 (05) : 989 - 998