MIA-Net: Multi-Modal Interactive Attention Network for Multi-Modal Affective Analysis

Cited by: 0
Authors
Li, Shuzhen [1 ,2 ,3 ]
Zhang, Tong [1 ,2 ,3 ]
Chen, Bianna [1 ,2 ,3 ]
Chen, C. L. Philip [1 ,2 ,3 ]
Affiliations
[1] South China Univ Technol, Sch Comp Sci & Engn, Guangdong Prov Key Lab Computat Intelligence & Cyb, Guangzhou 510006, Peoples R China
[2] Pazhou Lab, Guangzhou 510335, Peoples R China
[3] Minist Educ Hlth Intelligent Percept & Paralleled, Engn Res Ctr, Guangzhou, Peoples R China
Keywords
Multi-modal affective analysis; multi-modal fusion; multi-modal interactive attention; sentiment analysis; fusion network; model
DOI
10.1109/TAFFC.2023.3259010
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
When a multi-modal affective analysis model is extended from a bimodal task to a trimodal or multi-modal task, it is usually rebuilt as a hierarchical fusion model over every pair of modalities, resembling a binary tree. This causes model parameters and computation to grow rapidly as the number of modalities increases, which limits the model's generalization. Moreover, many multi-modal fusion methods ignore the fact that different modalities contribute unequally to affective analysis. To tackle these challenges, this article proposes a general multi-modal fusion model for trimodal and multi-modal affective analysis tasks, called the Multi-modal Interactive Attention Network (MIA-Net). Instead of treating all modalities equally, MIA-Net takes the modality that contributes most to emotion as the main modality and the others as auxiliary modalities. MIA-Net introduces multi-modal interactive attention (MIA) modules that adaptively select the important information from each auxiliary modality, one modality at a time, to improve the main-modal representation. Furthermore, by stacking multiple MIA modules, MIA-Net generalizes readily to trimodal or multi-modal tasks while keeping training efficient: computation grows only linearly with the number of modalities and the parameter count remains stable. Transfer, generalization, and efficiency experiments on widely used datasets demonstrate the effectiveness and generalizability of the proposed method.
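The abstract describes refining a main-modality representation by stacking one interactive attention module per auxiliary modality, so that cost grows linearly with the number of modalities. Below is a minimal PyTorch sketch of that general idea, not the authors' implementation: every module name, dimension, and hyperparameter (e.g., InteractiveAttentionBlock, dim=128, num_heads=4) is an assumption made for illustration only.

```python
# A minimal sketch only (assumed names and hyperparameters, not the authors'
# released code): a "main" modality sequence is refined by cross-attending to
# each auxiliary modality in turn, one stacked block per auxiliary modality.
import torch
import torch.nn as nn


class InteractiveAttentionBlock(nn.Module):
    """Cross-attention from the main-modality sequence (query) to one
    auxiliary-modality sequence (key/value), with residual refinement."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(
            nn.Linear(dim, dim * 2), nn.ReLU(), nn.Linear(dim * 2, dim)
        )

    def forward(self, main: torch.Tensor, aux: torch.Tensor) -> torch.Tensor:
        # main: (batch, T_main, dim); aux: (batch, T_aux, dim)
        attended, _ = self.attn(query=main, key=aux, value=aux)
        main = self.norm1(main + attended)        # inject auxiliary information
        return self.norm2(main + self.ffn(main))  # position-wise refinement


class StackedInteractiveFusion(nn.Module):
    """One block per auxiliary modality, so parameters and computation grow
    linearly with the number of modalities rather than with modality pairs."""

    def __init__(self, dim: int, num_aux: int, num_classes: int):
        super().__init__()
        self.blocks = nn.ModuleList(
            InteractiveAttentionBlock(dim) for _ in range(num_aux)
        )
        self.head = nn.Linear(dim, num_classes)

    def forward(self, main: torch.Tensor, aux_list: list[torch.Tensor]) -> torch.Tensor:
        for block, aux in zip(self.blocks, aux_list):
            main = block(main, aux)               # refine main modality one auxiliary at a time
        return self.head(main.mean(dim=1))        # temporal pooling + classification


if __name__ == "__main__":
    # Toy example: text as the main modality, audio and vision as auxiliaries.
    text = torch.randn(2, 20, 128)                # (batch, steps, features)
    audio, vision = torch.randn(2, 50, 128), torch.randn(2, 30, 128)
    model = StackedInteractiveFusion(dim=128, num_aux=2, num_classes=3)
    print(model(text, [audio, vision]).shape)     # torch.Size([2, 3])
```

Because one block is appended per auxiliary modality, extending from a trimodal to a four-modal task in this sketch adds a single block rather than restructuring a pairwise fusion hierarchy, mirroring the linear-computation, stable-parameter property the abstract attributes to stacking MIA modules.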
Pages: 2796-2809
Page count: 14