Exploring temporal representations by leveraging attention-based bidirectional LSTM-RNNs for multi-modal emotion recognition

Cited by: 126
Authors
Li, Chao [1 ]
Bao, Zhongtian [1 ]
Li, Linhao [2 ,3 ]
Zhao, Ziping [1 ]
Affiliations
[1] Tianjin Normal Univ, Coll Comp & Informat Engn, Tianjin 300387, Peoples R China
[2] Hebei Univ Technol, Sch Artificial Intelligence, Tianjin 300401, Peoples R China
[3] Hebei Univ Technol, Hebei Prov Key Lab Big Data Comp, Tianjin 300401, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Emotion recognition; EEG signals; Physiological signals; Deep learning; Multimedia content; Multi-modal fusion; CLASSIFICATION; MODELS;
DOI
10.1016/j.ipm.2019.102185
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Emotion recognition helps to automatically perceive a user's emotional response to multimedia content through implicit annotation, which in turn supports the development of effective user-centric services. Physiology-based approaches have increasingly attracted researchers' attention because of their objectivity in representing emotion. Conventional approaches to emotion recognition have mostly focused on extracting various kinds of hand-crafted features. However, hand-crafted features require domain knowledge for the specific task, and designing proper features can be time-consuming. Exploring the most effective physiology-based temporal feature representation for emotion recognition has therefore become the core problem of most work in this area. In this paper, we propose a multimodal attention-based BLSTM network framework for efficient emotion recognition. First, the raw physiological signals from each channel are transformed into spectrogram images to capture their time and frequency information. Second, attention-based Bidirectional Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs) are used to automatically learn the best temporal features. The learned deep features are then fed into a deep neural network (DNN) to predict the emotional output probability for each channel. Finally, a decision-level fusion strategy is used to predict the final emotion. Experimental results on the AMIGOS dataset show that our method outperforms other state-of-the-art methods.
Pages: 9
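The pipeline described in the abstract (per-channel spectrograms, an attention-based BLSTM, a DNN classifier per channel, and decision-level fusion) can be illustrated with a minimal sketch. This is not the authors' implementation; the use of PyTorch, all layer sizes, the number of channels, and the mean-of-probabilities fusion rule are assumptions made only for illustration.

```python
# Minimal sketch (assumed design, not the paper's released code): per-channel
# spectrogram -> attention-based BLSTM -> DNN, then decision-level fusion.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionBLSTMChannel(nn.Module):
    """Attention-based BLSTM over one channel's spectrogram frames."""
    def __init__(self, n_freq_bins=128, hidden=64, n_classes=2):
        super().__init__()
        self.blstm = nn.LSTM(n_freq_bins, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)   # frame-level attention scores
        self.dnn = nn.Sequential(               # DNN on the attended summary
            nn.Linear(2 * hidden, 64), nn.ReLU(), nn.Linear(64, n_classes)
        )

    def forward(self, spec):                    # spec: (batch, time, n_freq_bins)
        h, _ = self.blstm(spec)                 # (batch, time, 2*hidden)
        alpha = torch.softmax(self.attn(h), dim=1)
        summary = (alpha * h).sum(dim=1)        # attention-weighted temporal pooling
        return F.log_softmax(self.dnn(summary), dim=-1)

class MultiChannelFusion(nn.Module):
    """Decision-level fusion: average the per-channel class probabilities."""
    def __init__(self, n_channels=4, **kwargs):
        super().__init__()
        self.channels = nn.ModuleList(
            [AttentionBLSTMChannel(**kwargs) for _ in range(n_channels)]
        )

    def forward(self, specs):                   # specs: list of per-channel spectrograms
        probs = torch.stack([m(s).exp() for m, s in zip(self.channels, specs)])
        return probs.mean(dim=0)                # fused class probabilities

# Usage with random data: 4 physiological channels, 100 spectrogram frames each.
model = MultiChannelFusion(n_channels=4, n_freq_bins=128, hidden=64, n_classes=2)
fused = model([torch.randn(8, 100, 128) for _ in range(4)])
print(fused.shape)  # torch.Size([8, 2])
```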
Related Papers
50 records in total
  • [31] Multi-domain Network Intrusion Detection Based on Attention-based Bidirectional LSTM
    Wang, Xiaoning
    [J]. ITNEC 2023 - IEEE 6th Information Technology, Networking, Electronic and Automation Control Conference, 2023, : 805 - 810
  • [32] Exploring Deep Spectrum Representations via Attention-Based Recurrent and Convolutional Neural Networks for Speech Emotion Recognition
    Zhao, Ziping
    Bao, Zhongtian
    Zhao, Yiqin
    Zhang, Zixing
    Cummins, Nicholas
    Ren, Zhao
    Schuller, Bjorn
    [J]. IEEE ACCESS, 2019, 7 : 97515 - 97525
  • [33] A Two-Stage Attention Based Modality Fusion Framework for Multi-Modal Speech Emotion Recognition
    Hu, Dongni
    Chen, Chengxin
    Zhang, Pengyuan
    Li, Junfeng
    Yan, Yonghong
    Zhao, Qingwei
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2021, E104D (08) : 1391 - 1394
  • [34] An End-to-End Transformer with Progressive Tri-Modal Attention for Multi-modal Emotion Recognition
    Wu, Yang
    Peng, Pai
    Zhang, Zhenyu
    Zhao, Yanyan
    Qin, Bing
    [J]. PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VII, 2024, 14431 : 396 - 408
  • [35] Attention-based convolutional neural network with multi-modal temporal information fusion for motor imagery EEG decoding
    Ma X.
    Chen W.
    Pei Z.
    Zhang Y.
    Chen J.
    [J]. Computers in Biology and Medicine, 2024, 175
  • [36] Expression EEG Multimodal Emotion Recognition Method Based on the Bidirectional LSTM and Attention Mechanism
    Zhao, Yifeng
    Chen, Deyun
    [J]. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE, 2021, 2021
  • [37] AutoAMS: Automated attention-based multi-modal graph learning architecture search
    Al-Sabri, Raeed
    Gao, Jianliang
    Chen, Jiamin
    Oloulade, Babatounde Moctard
    Wu, Zhenpeng
    [J]. NEURAL NETWORKS, 2024, 179
  • [38] A Probabilistic Approach for Attention-Based Multi-Modal Human-Robot Interaction
    Begum, Momotaz
    Karray, Fakhri
    Mann, George K. I.
    Gosine, Raymond
    [J]. RO-MAN 2009: THE 18TH IEEE INTERNATIONAL SYMPOSIUM ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, VOLS 1 AND 2, 2009, : 909 - +
  • [39] Research of Multi-modal Emotion Recognition Based on Voice and Video Images
    Wang, Chuanyu
    Li, Weixiang
    Chen, Zhenhuan
    [J]. Computer Engineering and Applications, 2024, 57 (23) : 163 - 170
  • [40] Emotion recognition based on multi-modal physiological signals and transfer learning
    Fu, Zhongzheng
    Zhang, Boning
    He, Xinrun
    Li, Yixuan
    Wang, Haoyuan
    Huang, Jian
    [J]. FRONTIERS IN NEUROSCIENCE, 2022, 16