Recognition of emotion from speech using evolutionary cepstral coefficients

Cited by: 5

Authors:

Bakhshi, Ali [1]

Chalup, Stephan [1]

Harimi, Ali [2]

Mirhassani, Seyed Mostafa [3]

Affiliations:

[1] Univ Newcastle, Sch Elect Engn & Comp, Newcastle, NSW, Australia

[2] Islamic Azad Univ, Dept Elect Engn, Shahrood Branch, Shahrood, Iran

[3] Univ Malaya, Dept Biomed Engn, Kuala Lumpur, Malaysia

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2020 / Vol. 79 / Issue 47-48

Keywords:

Genetic algorithm; Mel filterbank; Cepstral coefficients; Speech emotion recognition; SPECTRAL FEATURES; FEATURE-EXTRACTION; NEURAL-NETWORK; CLASSIFICATION; ALGORITHM; FUSION; MFCC;

DOI:

10.1007/s11042-020-09591-1

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Finding an optimal representation of acoustic features is an ongoing challenge in automatic speech emotion recognition research. In this study, we proposed cepstral coefficients based on evolutionary filterbanks as emotional features. It is difficult to guarantee that a single optimized filterbank provides the best representation for emotion classification. Consequently, we employed six HMM-based binary classifiers, each using a specific filterbank optimized by a genetic algorithm, to categorize the data into seven emotion classes. These optimized classifiers were applied in a hierarchical manner and outperformed conventional Mel Frequency Cepstral Coefficients in terms of overall emotion classification accuracy. The proposed method using evolutionary-based cepstral coefficients achieved a weighted average recall of 87.29% on the Berlin database, while the same approach using conventional cepstral features achieved only 79.63%.
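
To make the idea of filterbank-based cepstral coefficients concrete, the following is a minimal sketch, not the authors' implementation. It assumes triangular filters defined by a sorted vector of edge and center frequencies (the kind of parameter vector a genetic algorithm could search over), a Hamming window, and a plain DCT-II. The function names, the 16 kHz sample rate, the 512-point FFT, and the choice of 20 filters and 13 coefficients are all illustrative assumptions that do not come from the record above.

```python
import numpy as np

def mel_points(n_filters, sample_rate):
    """Conventional Mel-spaced edge/center frequencies (baseline filterbank)."""
    mel_max = 2595.0 * np.log10(1.0 + (sample_rate / 2.0) / 700.0)
    mels = np.linspace(0.0, mel_max, n_filters + 2)
    return 700.0 * (10.0 ** (mels / 2595.0) - 1.0)

def triangular_filterbank(points_hz, n_fft, sample_rate):
    """Build triangular filters from strictly increasing frequencies.

    points_hz has n_filters + 2 entries: lower edge, the centers, upper edge.
    A search procedure (for example a genetic algorithm) could move these
    points away from their Mel positions; that search is not shown here.
    """
    freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    pts = np.asarray(points_hz, dtype=float)
    bank = np.zeros((len(pts) - 2, freqs.size))
    for i in range(1, len(pts) - 1):
        lo, mid, hi = pts[i - 1], pts[i], pts[i + 1]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        # Triangle: min of the two slopes, clipped at zero outside [lo, hi].
        bank[i - 1] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def cepstral_coefficients(frame, bank, n_ceps):
    """Log filterbank energies of one frame followed by an unnormalized DCT-II."""
    n_fft = 2 * (bank.shape[1] - 1)
    windowed = frame * np.hamming(frame.size)
    power = np.abs(np.fft.rfft(windowed, n_fft)) ** 2
    log_energy = np.log(bank @ power + 1e-10)  # small floor avoids log(0)
    m = bank.shape[0]
    k = np.arange(n_ceps)[:, None]
    n = np.arange(m)[None, :]
    dct = np.cos(np.pi * k * (n + 0.5) / m)
    return dct @ log_energy

if __name__ == "__main__":
    sample_rate, n_fft = 16000, 512
    rng = np.random.default_rng(0)
    frame = rng.standard_normal(400)  # stand-in for a 25 ms speech frame
    bank = triangular_filterbank(mel_points(20, sample_rate), n_fft, sample_rate)
    print(cepstral_coefficients(frame, bank, 13).shape)
```

With Mel-spaced points this reduces to an MFCC-style baseline; replacing `mel_points(...)` with any other strictly increasing frequency vector gives a different candidate filterbank whose coefficients could then be scored by a classifier.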


Pages: 35739-35759

Page count: 21

