Recognition of emotion from speech using evolutionary cepstral coefficients

Cited by: 5
Authors
Bakhshi, Ali [1]
Chalup, Stephan [1]
Harimi, Ali [2]
Mirhassani, Seyed Mostafa [3]
Affiliations
[1] Univ Newcastle, Sch Elect Engn & Comp, Newcastle, NSW, Australia
[2] Islamic Azad Univ, Dept Elect Engn, Shahrood Branch, Shahrood, Iran
[3] Univ Malaya, Dept Biomed Engn, Kuala Lumpur, Malaysia
Keywords
Genetic algorithm; Mel filterbank; Cepstral coefficients; Speech emotion recognition; Spectral features; Feature extraction; Neural network; Classification; Algorithm; Fusion; MFCC
DOI
10.1007/s11042-020-09591-1
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Finding an optimal representation of acoustic features is an ongoing challenge in automatic speech emotion recognition research. In this study, we propose cepstral coefficients based on evolutionary filterbanks as emotional features. It is difficult to guarantee that a single optimized filterbank provides the best representation for emotion classification. Consequently, we employed six HMM-based binary classifiers, each using a specific filterbank optimized by a genetic algorithm, to categorize the data into seven emotion classes. These optimized classifiers were applied in a hierarchical manner and outperformed conventional Mel Frequency Cepstral Coefficients (MFCCs) in terms of overall emotion classification accuracy. The proposed method using evolutionary cepstral coefficients achieved a weighted average recall of 87.29% on the Berlin database, whereas the same approach using conventional cepstral features achieved only 79.63%.
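The idea of cepstral coefficients computed from a non-Mel filterbank can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how a filterbank is encoded for the genetic algorithm, so the sketch assumes a candidate is simply a sorted vector of triangular-filter edge frequencies. The function names, the 16 kHz sample rate, the 20 filters, and the 13 coefficients are illustrative assumptions, and the random perturbation only stands in for one candidate a genetic algorithm might evaluate.

```python
import numpy as np

def mel_points(n_filters, f_min, f_max):
    """Edge/center frequencies (Hz) of a conventional Mel-spaced filterbank."""
    to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    return to_hz(np.linspace(to_mel(f_min), to_mel(f_max), n_filters + 2))

def triangular_filterbank(points_hz, n_fft, sample_rate):
    """Triangular filters defined by n_filters + 2 increasing frequencies."""
    freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    n_filters = len(points_hz) - 2
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, mid, hi = points_hz[i], points_hz[i + 1], points_hz[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def cepstral_coefficients(frame, bank, n_ceps):
    """Log filterbank energies of one frame followed by a DCT-II."""
    n_fft = 2 * (bank.shape[1] - 1)
    windowed = frame * np.hamming(frame.size)
    power = np.abs(np.fft.rfft(windowed, n_fft)) ** 2
    log_energy = np.log(bank @ power + 1e-10)  # small floor avoids log(0)
    n = log_energy.size
    k = np.arange(n_ceps)[:, None]
    dct_basis = np.cos(np.pi * k * (np.arange(n) + 0.5) / n)
    return dct_basis @ log_energy

# Assumed settings, chosen only for illustration.
sample_rate, n_fft, n_filters, n_ceps = 16000, 512, 20, 13
rng = np.random.default_rng(0)

# Conventional Mel filterbank as the baseline.
baseline_points = mel_points(n_filters, 0.0, sample_rate / 2.0)
baseline_bank = triangular_filterbank(baseline_points, n_fft, sample_rate)

# Hypothetical GA candidate: interior points perturbed, outer edges fixed.
candidate_points = baseline_points.copy()
candidate_points[1:-1] += rng.normal(0.0, 10.0, n_filters)
candidate_points = np.sort(candidate_points)
candidate_bank = triangular_filterbank(candidate_points, n_fft, sample_rate)

# One 25 ms frame of synthetic noise stands in for a speech frame.
frame = rng.normal(0.0, 1.0, 400)
baseline_ceps = cepstral_coefficients(frame, baseline_bank, n_ceps)
candidate_ceps = cepstral_coefficients(frame, candidate_bank, n_ceps)
```

In a genetic-algorithm setting, each candidate filterbank would be scored by training and validating a classifier on the resulting coefficients; that fitness evaluation is left out here because the abstract gives no details beyond the use of HMM-based binary classifiers.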
Pages: 35739-35759
Page count: 21
Related papers (entries 31-40 of 50)
  • [31] POWER-NORMALIZED CEPSTRAL COEFFICIENTS (PNCC) FOR ROBUST SPEECH RECOGNITION
    Kim, Chanwoo
    Stern, Richard M.
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4101 - 4104
  • [32] Evolutionary feature generation in speech emotion recognition
    Schuller, Bjorn
    Reiter, Stephan
    Rigoll, Gerhard
    [J]. 2006 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO - ICME 2006, VOLS 1-5, PROCEEDINGS, 2006, : 5 - +
  • [33] Spectral peak-weighted liftering of cepstral coefficients for speech recognition
    Kim, HK
    Lee, HS
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2000, E83D (07) : 1540 - 1549
  • [34] Combining Evidences from Mel Cepstral and Cochlear Cepstral Features for Speaker Recognition Using Whispered Speech
    Raikar, Aditya
    Gandhi, Ami
    Patil, Hemant A.
    [J]. TEXT, SPEECH, AND DIALOGUE (TSD 2015), 2015, 9302 : 405 - 413
  • [35] Analysis and design of Wavelet-Packet Cepstral coefficients for automatic speech recognition
    Pavez, Eduardo
    Silva, Jorge F.
    [J]. SPEECH COMMUNICATION, 2012, 54 (06) : 814 - 835
  • [36] Analysis and prediction of acoustic speech features from mel-frequency cepstral coefficients in distributed speech recognition architectures
    Darch, Jonathan
    Milner, Ben
    Vaseghi, Saeed
    [J]. Journal of the Acoustical Society of America, 2009, 124 (06): : 3989 - 4000
  • [37] Perceptual MVDR-based cepstral coefficients (PMCCs) for robust speech recognition
    Yapanel, UH
    Dharanipragada, S
    [J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 644 - 647
  • [38] Analysis and prediction of acoustic speech features from mel-frequency cepstral coefficients in distributed speech recognition architectures
    Darch, Jonathan
    Milner, Ben
    Vaseghi, Saeed
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2008, 124 (06): : 3989 - 4000
  • [39] Predictive trellis-coded quantization of the cepstral coefficients for the distributed speech recognition
    Kang, Sangwon
    Lee, Joonseok
    [J]. IEICE TRANSACTIONS ON COMMUNICATIONS, 2007, E90B (06) : 1570 - 1572
  • [40] Combining Mel Frequency Cepstral Coefficients and Fractal Dimensions for Automatic Speech Recognition
    Ezeiza, Aitzol
    Lopez de Ipina, Karmele
    Hernandez, Carmen
    Barroso, Nora
    [J]. ADVANCES IN NONLINEAR SPEECH PROCESSING, 2011, 7015 : 183 - +