Emotion Recognition from Human Speech Using Temporal Information and Deep Learning

Cited by: 31

Authors:

Kim, John W. ^{[1]}

Saurous, Rif A. ^{[2]}

Affiliations:

[1] Menlo Sch, Atherton, CA USA

[2] Google Inc, Mountain View, CA USA

Source:

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018

Keywords:

emotion recognition; temporal information; deep learning; CNN; LSTM;

DOI:

10.21437/Interspeech.2018-1132

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Emotion recognition by machine is a challenging task, but it has great potential to make empathic human-machine communications possible. In conventional approaches that consist of feature extraction and classifier stages, extensive studies have devoted their effort to developing good feature representations, but relatively little effort was made to make proper use of the important temporal information in these features. In this paper, we propose a model combining features known to be useful for emotion recognition and deep neural networks to exploit temporal information when recognizing emotion status. A benchmark evaluation on EMO-DB demonstrates that the proposed model achieves a state-of-the-art performance of 88.9% recognition rate.


Pages: 937-940

Page count: 4
