Emotion Recognition from Human Speech Using Temporal Information and Deep Learning

Cited by: 28
Authors
Kim, John W. [1]
Saurous, Rif A. [2]
Affiliations
[1] Menlo Sch, Atherton, CA USA
[2] Google Inc, Mountain View, CA USA
Keywords
emotion recognition; temporal information; deep learning; CNN; LSTM;
DOI
10.21437/Interspeech.2018-1132
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Emotion recognition by machine is a challenging task, but it has great potential to make empathic human-machine communications possible. In conventional approaches that consist of feature extraction and classifier stages, extensive studies have devoted their effort to developing good feature representations, but relatively little effort was made to make proper use of the important temporal information in these features. In this paper, we propose a model combining features known to be useful for emotion recognition and deep neural networks to exploit temporal information when recognizing emotion status. A benchmark evaluation on EMO-DB demonstrates that the proposed model achieves a state-of-the-art performance of 88.9% recognition rate.
Pages: 937 - 940
Page count: 4
Related papers
50 in total
  • [21] Deep Learning Techniques for Speech Emotion Recognition, from Databases to Models
    Abbaschian, Babak Joze
    Sierra-Sosa, Daniel
    Elmaghraby, Adel
    SENSORS, 2021, 21 (04) : 1 - 27
  • [22] Pattern recognition and features selection for speech emotion recognition model using deep learning
    Jermsittiparsert, Kittisak
    Abdurrahman, Abdurrahman
    Siriattakul, Parinya
    Sundeeva, Ludmila A.
    Hashim, Wahidah
    Rahim, Robbi
    Maseleno, Andino
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2020, 23 (04) : 799 - 806
  • [24] An Emotion Estimation from Human Speech Using Speech Recognition and Speech Synthesize
    Kurematsu, Masaki
    Ohashi, Marina
    Kinosita, Orimi
    Hakura, Jun
    Fujita, Hamido
    NEW TRENDS IN SOFTWARE METHODOLOGIES, TOOLS AND TECHNIQUES, 2008, 182 : 278 - 289
  • [25] Learning Salient Segments for Speech Emotion Recognition Using Attentive Temporal Pooling
    Xia, Xiaohan
    Jiang, Dongmei
    Sahli, Hichem
    IEEE ACCESS, 2020, 8 (08) : 151740 - 151752
  • [26] Emotion Recognition from Variable-Length Speech Segments Using Deep Learning on Spectrograms
    Ma, Xi
    Wu, Zhiyong
    Jia, Jia
    Xu, Mingxing
    Meng, Helen
    Cai, Lianhong
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3683 - 3687
  • [27] A Deep Learning Approach for Speech Emotion Recognition Optimization Using Meta-Learning
    Ottoni, Lara Toledo Cordeiro
    Ottoni, Andre Luiz Carvalho
    Cerqueira, Jes de Jesus Fiais
    ELECTRONICS, 2023, 12 (23)
  • [28] Ensemble deep learning with HuBERT for speech emotion recognition
    Yang, Janghoon
    2023 IEEE 17TH INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING, ICSC, 2023, : 153 - 154
  • [29] SPEECH EMOTION RECOGNITION-A DEEP LEARNING APPROACH
    Asiya, U. A.
    Kiran, V. K.
    PROCEEDINGS OF THE 2021 FIFTH INTERNATIONAL CONFERENCE ON I-SMAC (IOT IN SOCIAL, MOBILE, ANALYTICS AND CLOUD) (I-SMAC 2021), 2021, : 867 - 871
  • [30] Survey of Deep Representation Learning for Speech Emotion Recognition
    Latif, Siddique
    Rana, Rajib
    Khalifa, Sara
    Jurdak, Raja
    Qadir, Junaid
    Schuller, Bjorn
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2023, 14 (02) : 1634 - 1654