Novel acoustic features for speech emotion recognition

Cited by: 9

Authors:

Roh Yong-Wan [1]

Kim Dong-Ju [1]

Lee Woo-Seok [2]

Hong Kwang-Seok [1]

Affiliations:

[1] Sungkyunkwan Univ, Sch Informat & Commun Engn, Suwon 440746, Kyungki Do, South Korea

[2] Anyang Trade Ctr, Dev Div, DSP Dev Team, Anyang 431050, South Korea

Source:

SCIENCE IN CHINA SERIES E-TECHNOLOGICAL SCIENCES | 2009 / Vol. 52 / No. 07

Keywords:

speech emotion recognition; MFB spectral entropy; entropy; emotion recognition; rejection;

DOI:

10.1007/s11431-009-0204-3

Chinese Library Classification:

T [Industrial Technology];

Discipline classification code:

08;

Abstract:

This paper focuses on acoustic features that effectively improve the recognition of emotion in human speech. The novel features are based on spectral entropy parameters such as fast Fourier transform (FFT) spectral entropy, delta FFT spectral entropy, Mel-frequency filter bank (MFB) spectral entropy, and delta MFB spectral entropy. Spectral entropy features are simple, and they reflect both the frequency characteristics of speech and the way those characteristics change. We also implement an emotion rejection module based on the probability distributions of recognized-scores and rejected-scores, which reduces the false recognition rate and improves overall performance. Recognized-scores and rejected-scores are the probabilities of recognized and rejected emotion recognition results, respectively. These scores are first obtained from a pattern recognition stage that uses a Gaussian mixture model (GMM). We classify four emotional states: anger, sadness, happiness, and neutrality. The proposed method is evaluated using 45 sentences per emotion from 30 subjects (15 male and 15 female). Experimental results show that the proposed method is superior to existing GMM-based emotion recognition methods that use energy, zero-crossing rate (ZCR), linear prediction coefficient (LPC), and pitch parameters. One of the proposed features, the combination of MFB and delta MFB spectral entropy, improves performance by approximately 10% compared with the existing feature parameters for speech emotion recognition. Applying emotion rejection to results with low confidence scores yields a 4% performance improvement.
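
The abstract names FFT spectral entropy and its delta but gives no formulas, so the following is only a minimal sketch under a common textbook definition: the power spectrum of a frame is normalized to sum to 1 and its Shannon entropy is taken, and the delta is assumed to be a first-order difference between consecutive frames. The function names `power_spectrum`, `spectral_entropy`, and `delta` are hypothetical, and the paper's actual windowing, Mel filter bank, and delta computation may differ.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum of a real-valued frame (bins 0..N/2)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(power, eps=1e-12):
    """Shannon entropy in bits of a power spectrum normalized to sum to 1.

    Assumed definition; eps only guards against an all-zero frame.
    """
    total = sum(power) + eps
    probs = [p / total for p in power]
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def delta(values):
    """First-order difference between consecutive frames (assumed delta)."""
    return [b - a for a, b in zip(values, values[1:])]


# Two synthetic 64-sample frames: a pure tone and a single impulse.
tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
impulse = [1.0] + [0.0] * 63

entropies = [spectral_entropy(power_spectrum(f)) for f in (tone, impulse)]
print(entropies)         # low for the tone, high for the flat impulse spectrum
print(delta(entropies))  # positive: the spectrum became flatter
```

A tone concentrates its power in one bin, so its entropy is close to zero, while an impulse spreads power evenly over all bins and reaches the maximum, the base-2 logarithm of the number of bins. Under the same assumptions, an MFB variant would apply a Mel filter bank to the power spectrum and pass the filter-bank energies to `spectral_entropy` instead of the raw bins.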

Pages: 1838-1848

Page count: 11
