Speech Emotion Recognition Using Auditory Spectrogram and Cepstral Features

Authors
Zhao, Shujie [1]
Yang, Yan [1]
Cohen, Israel [2]
Zhang, Lijun [1]
Affiliations
[1] Northwestern Polytech Univ, CIAIC, Xian 710072, Shaanxi, Peoples R China
[2] Technion Israel Inst Technol, IL-3200003 Haifa, Israel
Funding
National Natural Science Foundation of China;
Keywords
Emotion recognition; speech signals; machine learning; pattern recognition; feature extraction; noise;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
A systematic comparison of how environmental noise affects key acoustic features is critical for transferring speech emotion recognition (SER) systems to real-world applications. In this study, we investigate the noise tolerance of different acoustic features in distinguishing emotions by comparing SER classification performance on clean and noisy speech signals. We extract spectral and cepstral parameters based on human auditory characteristics and develop machine learning algorithms that classify four types of emotions using these features. Experimental results across the clean and noisy data show that, compared to cepstral features, the auditory spectrogram-based features achieve higher recognition accuracy at low signal-to-noise ratios (SNRs) but lower accuracy at high SNRs. Gammatone filter cepstral coefficients (GFCCs) outperformed all other extracted features on the Berlin database of emotional speech (EmoDB) under all four tested noise conditions. These results indicate that auditory spectrogram-based features and cepstral features complement each other, which may help build SER systems with better noise robustness in real-world applications.
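The abstract compares performance on clean and noisy speech across SNRs but does not say how the noisy signals were produced. As an illustration only, the sketch below shows one common way to mix a noise recording into clean speech at a target SNR. The function names `mix_at_snr` and `measured_snr_db`, the synthetic sine "speech", and the white-noise source are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the mixture has the requested SNR in dB.

    Hypothetical helper for illustration; not the authors' procedure.
    """
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)
    # Repeat or trim the noise so it covers the whole utterance.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Choose gain g so that p_speech / (g**2 * p_noise) == 10 ** (snr_db / 10).
    gain = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + gain * noise

def measured_snr_db(speech, noisy):
    """Recover the SNR of a mixture when the clean signal is known."""
    residual = noisy - speech
    return 10 * np.log10(np.mean(speech ** 2) / np.mean(residual ** 2))

# Synthetic stand-ins: a 220 Hz tone as "speech" and white noise (assumptions).
rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 220 * np.arange(16000) / 16000)
noise = rng.standard_normal(8000)
noisy = mix_at_snr(speech, noise, snr_db=5.0)
```

In an evaluation like the one described, such a function would be applied to each utterance for every noise type and SNR level before feature extraction, so that the same classifier can be scored on matched clean and noisy inputs.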
Pages: 136-140
Page count: 5