Exploring temporal representations by leveraging attention-based bidirectional LSTM-RNNs for multi-modal emotion recognition

Cited by: 126
Authors
Li, Chao [1 ]
Bao, Zhongtian [1 ]
Li, Linhao [2 ,3 ]
Zhao, Ziping [1 ]
Affiliations
[1] Tianjin Normal Univ, Coll Comp & Informat Engn, Tianjin 300387, Peoples R China
[2] Hebei Univ Technol, Sch Artificial Intelligence, Tianjin 300401, Peoples R China
[3] Hebei Univ Technol, Hebei Prov Key Lab Big Data Comp, Tianjin 300401, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Emotion recognition; EEG signals; Physiological signals; Deep learning; Multimedia content; Multi-modal fusion; CLASSIFICATION; MODELS;
DOI
10.1016/j.ipm.2019.102185
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Emotion recognition helps to automatically perceive a user's emotional response to multimedia content through implicit annotation, which in turn supports the development of effective user-centric services. Physiology-based approaches have increasingly attracted researchers' attention because of their objectivity in representing emotion. Conventional approaches to emotion recognition have mostly focused on extracting various kinds of hand-crafted features. However, hand-crafted features require domain knowledge for the specific task, and designing proper features can be time-consuming. Exploring the most effective physiology-based temporal feature representation for emotion recognition has therefore become the core problem of most work in this area. In this paper, we propose a multimodal attention-based BLSTM network framework for efficient emotion recognition. First, the raw physiological signals from each channel are transformed into spectrogram images to capture their time and frequency information. Second, attention-based Bidirectional Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs) are used to automatically learn the best temporal features. The learned deep features are then fed into a deep neural network (DNN) to predict the emotional output probability for each channel. Finally, a decision-level fusion strategy is used to predict the final emotion. Experimental results on the AMIGOS dataset show that our method outperforms other state-of-the-art methods.
Pages: 9
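The pipeline described in the abstract (per-channel spectrograms, an attention-based BLSTM, a DNN classifier per channel, and decision-level fusion) can be illustrated with a minimal sketch. This is not the authors' implementation; the use of PyTorch, all layer sizes, the number of channels, and the mean-of-probabilities fusion rule are assumptions made only for illustration.

```python
# Minimal sketch (assumed design, not the paper's released code): per-channel
# spectrogram -> attention-based BLSTM -> DNN, then decision-level fusion.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionBLSTMChannel(nn.Module):
    """Attention-based BLSTM over one channel's spectrogram frames."""
    def __init__(self, n_freq_bins=128, hidden=64, n_classes=2):
        super().__init__()
        self.blstm = nn.LSTM(n_freq_bins, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)   # frame-level attention scores
        self.dnn = nn.Sequential(               # DNN on the attended summary
            nn.Linear(2 * hidden, 64), nn.ReLU(), nn.Linear(64, n_classes)
        )

    def forward(self, spec):                    # spec: (batch, time, n_freq_bins)
        h, _ = self.blstm(spec)                 # (batch, time, 2*hidden)
        alpha = torch.softmax(self.attn(h), dim=1)
        summary = (alpha * h).sum(dim=1)        # attention-weighted temporal pooling
        return F.log_softmax(self.dnn(summary), dim=-1)

class MultiChannelFusion(nn.Module):
    """Decision-level fusion: average the per-channel class probabilities."""
    def __init__(self, n_channels=4, **kwargs):
        super().__init__()
        self.channels = nn.ModuleList(
            [AttentionBLSTMChannel(**kwargs) for _ in range(n_channels)]
        )

    def forward(self, specs):                   # specs: list of per-channel spectrograms
        probs = torch.stack([m(s).exp() for m, s in zip(self.channels, specs)])
        return probs.mean(dim=0)                # fused class probabilities

# Usage with random data: 4 physiological channels, 100 spectrogram frames each.
model = MultiChannelFusion(n_channels=4, n_freq_bins=128, hidden=64, n_classes=2)
fused = model([torch.randn(8, 100, 128) for _ in range(4)])
print(fused.shape)  # torch.Size([8, 2])
```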
Related Papers
50 records in total
  • [31] Multi-domain Network Intrusion Detection Based on Attention-based Bidirectional LSTM
    Wang, Xiaoning
    [J]. ITNEC 2023 - IEEE 6th Information Technology, Networking, Electronic and Automation Control Conference, 2023, : 805 - 810
  • [32] Exploring Deep Spectrum Representations via Attention-Based Recurrent and Convolutional Neural Networks for Speech Emotion Recognition
    Zhao, Ziping
    Bao, Zhongtian
    Zhao, Yiqin
    Zhang, Zixing
    Cummins, Nicholas
    Ren, Zhao
    Schuller, Bjorn
    [J]. IEEE ACCESS, 2019, 7 : 97515 - 97525
  • [33] A Two-Stage Attention Based Modality Fusion Framework for Multi-Modal Speech Emotion Recognition
    Hu, Dongni
    Chen, Chengxin
    Zhang, Pengyuan
    Li, Junfeng
    Yan, Yonghong
    Zhao, Qingwei
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2021, E104D (08) : 1391 - 1394
  • [34] An End-to-End Transformer with Progressive Tri-Modal Attention for Multi-modal Emotion Recognition
    Wu, Yang
    Peng, Pai
    Zhang, Zhenyu
    Zhao, Yanyan
    Qin, Bing
    [J]. PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VII, 2024, 14431 : 396 - 408
  • [35] Attention-based convolutional neural network with multi-modal temporal information fusion for motor imagery EEG decoding
    Ma X.
    Chen W.
    Pei Z.
    Zhang Y.
    Chen J.
    [J]. Computers in Biology and Medicine, 2024, 175
  • [36] Expression EEG Multimodal Emotion Recognition Method Based on the Bidirectional LSTM and Attention Mechanism
    Zhao, Yifeng
    Chen, Deyun
    [J]. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE, 2021, 2021
  • [37] AutoAMS: Automated attention-based multi-modal graph learning architecture search
    Al-Sabri, Raeed
    Gao, Jianliang
    Chen, Jiamin
    Oloulade, Babatounde Moctard
    Wu, Zhenpeng
    [J]. NEURAL NETWORKS, 2024, 179
  • [38] A Probabilistic Approach for Attention-Based Multi-Modal Human-Robot Interaction
    Begum, Momotaz
    Karray, Fakhri
    Mann, George K. I.
    Gosine, Raymond
    [J]. RO-MAN 2009: THE 18TH IEEE INTERNATIONAL SYMPOSIUM ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, VOLS 1 AND 2, 2009, : 909 - +
  • [39] Research of Multi-modal Emotion Recognition Based on Voice and Video Images
    Wang, Chuanyu
    Li, Weixiang
    Chen, Zhenhuan
    [J]. Computer Engineering and Applications, 2024, 57 (23) : 163 - 170
  • [40] Emotion recognition based on multi-modal physiological signals and transfer learning
    Fu, Zhongzheng
    Zhang, Boning
    He, Xinrun
    Li, Yixuan
    Wang, Haoyuan
    Huang, Jian
    [J]. FRONTIERS IN NEUROSCIENCE, 2022, 16