Emotion Recognition from Human Speech Using Temporal Information and Deep Learning

Cited by: 31
Authors
Kim, John W. [1]
Saurous, Rif A. [2]
Affiliations
[1] Menlo Sch, Atherton, CA USA
[2] Google Inc, Mountain View, CA USA
Keywords
emotion recognition; temporal information; deep learning; CNN; LSTM
DOI
10.21437/Interspeech.2018-1132
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Emotion recognition by machine is a challenging task, but it has great potential to make empathic human-machine communications possible. In conventional approaches that consist of feature extraction and classifier stages, extensive studies have devoted their effort to developing good feature representations, but relatively little effort was made to make proper use of the important temporal information in these features. In this paper, we propose a model combining features known to be useful for emotion recognition and deep neural networks to exploit temporal information when recognizing emotion status. A benchmark evaluation on EMO-DB demonstrates that the proposed model achieves a state-of-the-art performance of 88.9% recognition rate.
Pages: 937-940
Number of pages: 4
Related papers
50 in total
  • [31] Evaluating deep learning architectures for Speech Emotion Recognition
    Fayek, Haytham M.
    Lech, Margaret
    Cavedon, Lawrence
    NEURAL NETWORKS, 2017, 92 : 60 - 68
  • [32] Lightweight Deep Learning Framework for Speech Emotion Recognition
    Akinpelu, Samson
    Viriri, Serestina
    Adegun, Adekanmi
    IEEE ACCESS, 2023, 11 : 77086 - 77098
  • [33] Deep Learning Techniques for Speech Emotion Recognition: A Review
    Pandey, Sandeep Kumar
    Shekhawat, H. S.
    Prasanna, S. R. M.
    2019 29TH INTERNATIONAL CONFERENCE RADIOELEKTRONIKA (RADIOELEKTRONIKA), 2019, : 197 - 202
  • [34] SPEECH EMOTION RECOGNITION USING SEMANTIC INFORMATION
    Tzirakis, Panagiotis
    Anh Nguyen
    Zafeiriou, Stefanos
    Schuller, Bjoern W.
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6279 - 6283
  • [35] Speech Emotion Recognition Based on Two-Stream Deep Learning Model Using Korean Audio Information
    Jo, A-Hyeon
    Kwak, Keun-Chang
    APPLIED SCIENCES-BASEL, 2023, 13 (04):
  • [36] Speech Emotion Recognition Using Gammatone Cepstral Coefficients and Deep Learning Features
    Sharan, Roneel V.
    2023 IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLIED NETWORK TECHNOLOGIES, ICMLANT, 2023, : 139 - 142
  • [37] Speech emotion recognition using feature fusion: a hybrid approach to deep learning
    Khan, Waleed Akram
    ul Qudous, Hamad
    Farhan, Asma Ahmad
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (31) : 75557 - 75584
  • [38] Speech Emotion Recognition Using Deep Learning Transfer Models and Explainable Techniques
    Kim, Tae-Wan
    Kwak, Keun-Chang
    APPLIED SCIENCES-BASEL, 2024, 14 (04):
  • [39] Deep Temporal Models using Identity Skip-Connections for Speech Emotion Recognition
    Kim, Jaebok
    Englebienne, Gwenn
    Truong, Khiet P.
    Evers, Vanessa
    PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017, : 1006 - 1013
  • [40] Speech Emotion Recognition Using Deep Neural Network and Extreme Learning Machine
    Han, Kun
    Yu, Dong
    Tashev, Ivan
    15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 223 - 227