MIA-Net: Multi-Modal Interactive Attention Network for Multi-Modal Affective Analysis

Cited by: 0
Authors
Li, Shuzhen [1 ,2 ,3 ]
Zhang, Tong [1 ,2 ,3 ]
Chen, Bianna [1 ,2 ,3 ]
Chen, C. L. Philip [1 ,2 ,3 ]
Affiliations
[1] South China Univ Technol, Sch Comp Sci & Engn, Guangdong Prov Key Lab Computat Intelligence & Cyb, Guangzhou 510006, Peoples R China
[2] Pazhou Lab, Guangzhou 510335, Peoples R China
[3] Minist Educ Hlth Intelligent Percept & Paralleled, Engn Res Ctr, Guangzhou, Peoples R China
Keywords
Multi-modal affective analysis; multi-modal fusion; multi-modal interactive attention; sentiment analysis; fusion network; model
DOI
10.1109/TAFFC.2023.3259010
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
When a multi-modal affective analysis model is extended from a bimodal task to a trimodal or multi-modal task, it is usually rebuilt as a hierarchical fusion model over every pair of modalities, resembling a binary tree. This causes model parameters and computation to grow rapidly as the number of modalities increases, which limits the model's generalization. Moreover, many multi-modal fusion methods ignore the fact that different modalities contribute unequally to affective analysis. To tackle these challenges, this article proposes a general multi-modal fusion model for trimodal and multi-modal affective analysis tasks, called the Multi-modal Interactive Attention Network (MIA-Net). Instead of treating all modalities equally, MIA-Net takes the modality that contributes most to emotion as the main modality and the others as auxiliary modalities. MIA-Net introduces multi-modal interactive attention (MIA) modules that adaptively select the important information from each auxiliary modality, one modality at a time, to improve the main-modal representation. Furthermore, by stacking multiple MIA modules, MIA-Net generalizes readily to trimodal or multi-modal tasks while keeping training efficient: computation grows only linearly with the number of modalities and the parameter count remains stable. Transfer, generalization, and efficiency experiments on widely used datasets demonstrate the effectiveness and generalizability of the proposed method.
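The abstract describes refining a main-modality representation by stacking one interactive attention module per auxiliary modality, so that cost grows linearly with the number of modalities. Below is a minimal PyTorch sketch of that general idea, not the authors' implementation: every module name, dimension, and hyperparameter (e.g., InteractiveAttentionBlock, dim=128, num_heads=4) is an assumption made for illustration only.

```python
# A minimal sketch only (assumed names and hyperparameters, not the authors'
# released code): a "main" modality sequence is refined by cross-attending to
# each auxiliary modality in turn, one stacked block per auxiliary modality.
import torch
import torch.nn as nn


class InteractiveAttentionBlock(nn.Module):
    """Cross-attention from the main-modality sequence (query) to one
    auxiliary-modality sequence (key/value), with residual refinement."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(
            nn.Linear(dim, dim * 2), nn.ReLU(), nn.Linear(dim * 2, dim)
        )

    def forward(self, main: torch.Tensor, aux: torch.Tensor) -> torch.Tensor:
        # main: (batch, T_main, dim); aux: (batch, T_aux, dim)
        attended, _ = self.attn(query=main, key=aux, value=aux)
        main = self.norm1(main + attended)        # inject auxiliary information
        return self.norm2(main + self.ffn(main))  # position-wise refinement


class StackedInteractiveFusion(nn.Module):
    """One block per auxiliary modality, so parameters and computation grow
    linearly with the number of modalities rather than with modality pairs."""

    def __init__(self, dim: int, num_aux: int, num_classes: int):
        super().__init__()
        self.blocks = nn.ModuleList(
            InteractiveAttentionBlock(dim) for _ in range(num_aux)
        )
        self.head = nn.Linear(dim, num_classes)

    def forward(self, main: torch.Tensor, aux_list: list[torch.Tensor]) -> torch.Tensor:
        for block, aux in zip(self.blocks, aux_list):
            main = block(main, aux)               # refine main modality one auxiliary at a time
        return self.head(main.mean(dim=1))        # temporal pooling + classification


if __name__ == "__main__":
    # Toy example: text as the main modality, audio and vision as auxiliaries.
    text = torch.randn(2, 20, 128)                # (batch, steps, features)
    audio, vision = torch.randn(2, 50, 128), torch.randn(2, 30, 128)
    model = StackedInteractiveFusion(dim=128, num_aux=2, num_classes=3)
    print(model(text, [audio, vision]).shape)     # torch.Size([2, 3])
```

Because one block is appended per auxiliary modality, extending from a trimodal to a four-modal task in this sketch adds a single block rather than restructuring a pairwise fusion hierarchy, mirroring the linear-computation, stable-parameter property the abstract attributes to stacking MIA modules.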
Pages: 2796-2809
Page count: 14