Attention-LSTM-Attention Model for Speech Emotion Recognition and Analysis of IEMOCAP Database

Cited by: 55
Authors
Yu, Yeonguk [1]
Kim, Yoon-Joong [1]
Affiliations
[1] Hanbat Natl Univ, Dept Comp Engn, Daejeon 34158, South Korea
Keywords
speech-emotion recognition; attention mechanism; LSTM;
DOI
10.3390/electronics9050713
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
We propose a speech-emotion recognition (SER) model with an "attention-Long Short-Term Memory (LSTM)-attention" component that combines IS09, a commonly used feature for SER, with the mel spectrogram, and we analyze the reliability problem of the interactive emotional dyadic motion capture (IEMOCAP) database. The attention mechanism of the model focuses on the emotion-related elements of the IS09 and mel spectrogram features and on the emotion-related durations along the time axis of those features. Thus, the model extracts emotion information from a given speech signal. In the baseline study, the proposed model achieved a weighted accuracy (WA) of 68% on the improvised dataset of IEMOCAP. However, neither the proposed model of the main study nor the modified models could exceed a WA of 68% on the improvised dataset. This is because of the limited reliability of the IEMOCAP dataset. A more reliable dataset is required for a more accurate evaluation of the model's performance. Therefore, in this study, we reconstructed a more reliable dataset based on the labeling results provided by IEMOCAP. On this more reliable dataset, the model achieved a WA of 73%.
Pages: 12
Related papers
50 in total
  • [31] Speech Emotion Recognition Using Convolutional-Recurrent Neural Networks with Attention Model
    Mu, Yawei
    Gomez, Hernandez
    Cano Montes, Antonio
    Alcaraz Martinez, Carlos
    Wang, Xuetian
    Gao, Hongmin
    2ND INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING, INFORMATION SCIENCE AND INTERNET TECHNOLOGY, CII 2017, 2017, : 341 - 350
  • [32] Attention-Emotion-Enhanced Convolutional LSTM for Sentiment Analysis
    Huang, Faliang
    Li, Xuelong
    Yuan, Changan
    Zhang, Shichao
    Zhang, Jilian
    Qiao, Shaojie
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (09) : 4332 - 4345
  • [33] Video Emotion Recognition Based on Hierarchical Attention Model
    Wang X.
    Pan L.
    Peng M.
    Hu M.
    Jin C.
    Ren F.
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2020, 32 (01) : 27 - 35
  • [34] An Attention Model for Group-Level Emotion Recognition
    Gupta, Aarush
    Agrawal, Dakshit
    Chauhan, Hardik
    Dolz, Jose
    Pedersoli, Marco
    ICMI'18: PROCEEDINGS OF THE 20TH ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2018, : 611 - 615
  • [35] EEG Emotion Recognition Model Based on Attention and GAN
    Qiao, Wenxuan
    Sun, Li
    Wu, Jinhui
    Wang, Pinshuo
    Li, Jiubo
    Zhao, Minjie
    IEEE ACCESS, 2024, 12 : 32308 - 32319
  • [36] Speech Emotion Recognition Using Convolutional Neural Networks with Attention Mechanism
    Mountzouris, Konstantinos
    Perikos, Isidoros
    Hatzilygeroudis, Ioannis
    Corchado, Juan M.
    Iglesias, Carlos A.
    Kim, Byung-Gyu
    Mehmood, Rashid
    Ren, Fuji
    Lee, In
    ELECTRONICS, 2023, 12 (20)
  • [37] SPEECH EMOTION RECOGNITION USING MULTI-HOP ATTENTION MECHANISM
    Yoon, Seunghyun
    Byun, Seokhyun
    Dey, Subhadeep
    Jung, Kyomin
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 2822 - 2826
  • [38] Speech Emotion Recognition via Multi-Level Attention Network
    Liu, Ke
    Wang, Dekui
    Wu, Dongya
    Liu, Yutao
    Feng, Jun
    IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 2278 - 2282
  • [39] BAT: Block and token self-attention for speech emotion recognition
    Lei, Jianjun
    Zhu, Xiangwei
    Wang, Ying
    NEURAL NETWORKS, 2022, 156 : 67 - 80
  • [40] Temporal Attention Convolutional Network for Speech Emotion Recognition with Latent Representation
    Liu, Jiaxing
    Liu, Zhilei
    Wang, Longbiao
    Gao, Yuan
    Guo, Lili
    Dang, Jianwu
    INTERSPEECH 2020, 2020, : 2337 - 2341