Speech Emotion Recognition Using Auditory Spectrogram and Cepstral Features

Citations: 0
Authors
Zhao, Shujie [1]
Yang, Yan [1]
Cohen, Israel [2]
Zhang, Lijun [1]
Affiliations
[1] Northwestern Polytech Univ, CIAIC, Xian 710072, Shaanxi, Peoples R China
[2] Technion Israel Inst Technol, IL-3200003 Haifa, Israel
Funding
National Natural Science Foundation of China
Keywords
Emotion recognition; speech signals; machine learning; pattern recognition; feature extraction; noise
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
A systematic comparison of how environmental noise affects key acoustic features is critical for transferring speech emotion recognition (SER) systems to real-world applications. In this study, we investigate the noise tolerance of different acoustic features in distinguishing emotions by comparing SER classification performance on clean and noisy speech signals. We extract spectral and cepstral parameters based on human auditory characteristics and develop machine learning algorithms that classify four types of emotions using these features. Experimental results on the clean and noisy data show that, compared with cepstral features, auditory spectrogram-based features achieve higher recognition accuracy at low signal-to-noise ratios (SNRs) but lower accuracy at high SNRs. Gammatone filter cepstral coefficients (GFCCs) outperformed all the other extracted features on the Berlin database of emotional speech (EmoDB) under all four tested noise conditions. These results indicate a complementary relationship between auditory spectrogram-based features and cepstral features, which can help achieve better noise robustness for SER in real-world applications.
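The clean-versus-noisy comparison described in the abstract depends on corrupting clean speech at a controlled SNR. The abstract does not say how the noisy signals were produced, so the following is only a minimal sketch of one common approach: scaling a noise recording so that the speech-to-noise power ratio matches a target value in dB. The function names `mix_at_snr` and `measured_snr_db`, the synthetic sine "speech", the Gaussian noise, and the SNR values in the loop are all assumptions made for illustration, not details taken from the paper.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add it.

    Hypothetical helper for illustration; not the authors' procedure.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must both have non-zero power")
    # From SNR_dB = 10 * log10(P_speech / P_noise), solve for the target noise power.
    target_p_noise = p_speech / (10 ** (snr_db / 10))
    gain = math.sqrt(target_p_noise / p_noise)
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """Recover the SNR in dB from the clean signal and the mixture."""
    p_speech = sum(s * s for s in speech)
    p_residual = sum((y - s) ** 2 for s, y in zip(speech, noisy))
    return 10 * math.log10(p_speech / p_residual)


# Synthetic stand-ins: a 220 Hz tone as "speech" and white Gaussian noise.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

for snr_db in (0, 5, 10, 20):  # assumed SNR grid, chosen only for the demo
    noisy = mix_at_snr(speech, noise, snr_db)
    print(snr_db, round(measured_snr_db(speech, noisy), 2))
```

In an evaluation of this kind, each feature set would then be extracted from the mixtures at every SNR and the classifier accuracies compared; real recordings and recorded noise would replace the synthetic signals used here.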
Pages: 136-140
Number of pages: 5
Related papers
50 in total
  • [31] Speech Emotion Recognition Using Local and Global Features
    Gao, Yuanbo
    Li, Baobin
    Wang, Ning
    Zhu, Tingshao
    [J]. BRAIN INFORMATICS, BI 2017, 2017, 10654 : 3 - 13
  • [32] Emotion recognition using novel speech signal features
    Tabatabaei, Talieh Seyed
    Krishnan, Sridhar
    Guergachi, Aziz
    [J]. 2007 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS 1-11, 2007, : 345 - +
  • [33] Recognition of Human Speech Emotion Using Variants of Mel-Frequency Cepstral Coefficients
    Palo, Hemanta Kumar
    Chandra, Mahesh
    Mohanty, Mihir Narayan
    [J]. ADVANCES IN SYSTEMS, CONTROL AND AUTOMATION, 2018, 442 : 491 - 498
  • [34] Emotion Recognition from Speech Signal Using Mel-Frequency Cepstral Coefficients
    Korkmaz, Onur Erdem
    Atasoy, Ayten
    [J]. 2015 9TH INTERNATIONAL CONFERENCE ON ELECTRICAL AND ELECTRONICS ENGINEERING (ELECO), 2015, : 1254 - 1257
  • [35] Multi-Input Speech Emotion Recognition Model Using Mel Spectrogram and GeMAPS
    Toyoshima, Itsuki
    Okada, Yoshifumi
    Ishimaru, Momoko
    Uchiyama, Ryunosuke
    Tada, Mayu
    [J]. SENSORS, 2023, 23 (03)
  • [36] Robust Speech Recognition Combining Cepstral and Articulatory Features
    Zha, Zhuan-ling
    Hu, Jin
    Zhan, Qing-ran
    Shan, Ya-hui
    Xie, Xiang
    Wang, Jing
    Cheng, Hao-bo
    [J]. PROCEEDINGS OF 2017 3RD IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2017, : 1401 - 1405
  • [37] Excitation Features of Speech for Emotion Recognition Using Neutral Speech as Reference
    Kadiri, Sudarsana Reddy
    Gangamohan, P.
    Gangashetty, Suryakanth, V
    Alku, Paavo
    Yegnanarayana, B.
    [J]. CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2020, 39 (09) : 4459 - 4481
  • [38] Excitation Features of Speech for Emotion Recognition Using Neutral Speech as Reference
    Sudarsana Reddy Kadiri
    P. Gangamohan
    Suryakanth V. Gangashetty
    Paavo Alku
    B. Yegnanarayana
    [J]. Circuits, Systems, and Signal Processing, 2020, 39 : 4459 - 4481
  • [39] Recognition of reverberant speech using full cepstral features and spectral missing data
    Palomaki, Kalle J.
    Brown, Guy J.
    Barker, Jon R.
    [J]. 2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 289 - 292
  • [40] Speech emotion recognition using cepstral features extracted with novel triangular filter banks based on bark and ERB frequency scales
    Nagarajan, Sugan
    Nettimi, Satya Sai Srinivas
    Kumar, Lakshmi Sutha
    Nath, Malaya Kumar
    Kanhe, Aniruddha
    [J]. DIGITAL SIGNAL PROCESSING, 2020, 104