An Experimental Study of Speech Emotion Recognition Based on Deep Convolutional Neural Networks

被引:0
|
作者
Zheng, W. Q. [1 ]
Yu, J. S. [1 ]
Zou, Y. X. [1 ]
机构
[1] Peking Univ, Sch Elect Comp Engn, ADSPLAB ELIP, Shenzhen, Peoples R China
关键词
speech emotion recognition; deep convolutional neural networks; principle component analysis whitening; speech spectrogram; CLASSIFICATION; FEATURES; SVM;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Speech emotion recognition (SER) is a challenging task since it is unclear what kind of features are able to reflect the characteristics of human emotion from speech. However, traditional feature extractions perform inconsistently for different emotion recognition tasks. Obviously, different spectrogram provides information reflecting difference emotion. This paper proposes a systematical approach to implement an effectively emotion recognition system based on deep convolution neural networks (DCNNs) using labeled training audio data. Specifically, the log-spectrogram is computed and the principle component analysis (PCA) technique is used to reduce the dimensionality and suppress the interferences. Then the PCA whitened spectrogram is split into non-overlapping segments. The DCNN is constructed to learn the representation of the emotion from the segments with labeled training speech data. Our preliminary experiments show the proposed emotion recognition system based on DCNNs (containing 2 convolution and 2 pooling layers) achieves about 40% classification accuracy. Moreover, it also outperforms the SVM based classification using the hand-crafted acoustic features.
引用
收藏
页码:827 / 831
页数:5
相关论文
共 50 条
  • [41] An Experimental Study on Speech Enhancement Based on Deep Neural Networks
    Xu, Yong
    Du, Jun
    Dai, Li-Rong
    Lee, Chin-Hui
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2014, 21 (01) : 65 - 68
  • [42] Constructing Speech Emotion Recognition Model Based on Convolutional Neural Network
    Kuo, Jong-Yih
    Chen, Zhao-Ming
    Lin, Hui-Chi
    [J]. 2021 28TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE WORKSHOPS (APSECW 2021), 2021, : 52 - 56
  • [43] Very Deep Convolutional Neural Networks for Noise Robust Speech Recognition
    Qian, Yanmin
    Bi, Mengxiao
    Tan, Tian
    Yu, Kai
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (12) : 2263 - 2276
  • [44] Maxout neurons for deep convolutional and LSTM neural networks in speech recognition
    Cai, Meng
    Liu, Jia
    [J]. SPEECH COMMUNICATION, 2016, 77 : 53 - 64
  • [45] Factored deep convolutional neural networks for noise robust speech recognition
    Fujimoto, Masakiyo
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3837 - 3841
  • [46] A Study on Speech Emotion Recognition Using a Deep Neural Network
    Lee, Kyong Hee
    Choi, Hyun Kyun
    Jang, Byung Tae
    Kim, Do Hyun
    [J]. 2019 10TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY CONVERGENCE (ICTC): ICT CONVERGENCE LEADING THE AUTONOMOUS FUTURE, 2019, : 1162 - 1165
  • [47] Audiovisual speech recognition based on a deep convolutional neural network
    Rudregowda S.
    Patilkulkarni S.
    Ravi V.
    H.L. G.
    Krichen M.
    [J]. Data Science and Management, 2024, 7 (01): : 25 - 34
  • [48] Design of a Convolutional Neural Network for Speech Emotion Recognition
    Lee, Kyong Hee
    Kim, Do Hyun
    [J]. 11TH INTERNATIONAL CONFERENCE ON ICT CONVERGENCE: DATA, NETWORK, AND AI IN THE AGE OF UNTACT (ICTC 2020), 2020, : 1332 - 1335
  • [49] CONVOLUTIONAL NEURAL NETWORK TECHNIQUES FOR SPEECH EMOTION RECOGNITION
    Parthasarathy, Srinivas
    Tashev, Ivan
    [J]. 2018 16TH INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT (IWAENC), 2018, : 121 - 125
  • [50] Emotion Recognition in Horses with Convolutional Neural Networks
    Corujo, Luis A.
    Kieson, Emily
    Schloesser, Timo
    Gloor, Peter A.
    [J]. FUTURE INTERNET, 2021, 13 (10):