Learning Modality-Complementary and Eliminating-Redundancy Representations with Multi-Task Learning for Multimodal Sentiment Analysis

Cited by: 1
Authors
Zhao, Xiaowei [1 ]
Miao, Xinyu [1 ]
Xu, Xiujuan [1 ]
Liu, Yu [1 ]
Cao, Yifei [1 ]
Affiliations
[1] Dalian Univ Technol, Sch Software Technol, Dalian, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
multimodal sentiment analysis; cross-modal transformer; multimodal information bottleneck; multi-task learning; label generation;
DOI
10.1109/IJCNN60899.2024.10650083
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A crucial issue in multimodal language processing is representation learning. Previous works jointly train multimodal and unimodal tasks to learn the consistency and difference of modality representations. However, due to the lack of cross-modal interaction, the extraction of complementary features between modalities is insufficient. Moreover, during multimodal fusion, the generated multimodal embeddings may be redundant, and unimodal representations also contain noise, which negatively influences the final sentiment prediction. To this end, we construct a Modality-Complementary and Eliminating-Redundancy multi-task learning model (MCER) and additionally add a cross-modal task that learns complementary features between each modality pair through a gated transformer. Two label generation modules then learn modality-specific and modality-complementary representations. Additionally, we introduce the multimodal information bottleneck (MIB) in both the multimodal and unimodal tasks to filter out noise in unimodal representations and to learn powerful, sufficient multimodal embeddings that are free of redundancy. Finally, we conduct extensive experiments on two popular sentiment analysis benchmarks, MOSI and MOSEI. Experimental results demonstrate that our model significantly outperforms the current strong baselines.
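The abstract does not specify the internals of the gated transformer used for the cross-modal task. As a minimal illustrative sketch (not the authors' implementation), one plausible form is cross-modal attention in which one modality's tokens query another modality's tokens, with a sigmoid gate controlling how much of the attended complementary signal is mixed back in; all weights below are random stand-ins for learned parameters:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gated_cross_modal_attention(query_mod, kv_mod, rng):
    """Hypothetical gated cross-modal attention step: queries come from
    one modality, keys/values from another, and a sigmoid gate decides
    how much of the attended (complementary) signal is added back."""
    t_q, d = query_mod.shape
    # Random projections stand in for learned weight matrices.
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    Wg = rng.standard_normal((2 * d, d)) / np.sqrt(2 * d)

    Q, K, V = query_mod @ Wq, kv_mod @ Wk, kv_mod @ Wv
    attn = softmax(Q @ K.T / np.sqrt(d))       # (t_q, t_kv) attention map
    cross = attn @ V                           # complementary features
    gate = 1 / (1 + np.exp(-np.concatenate([query_mod, cross], axis=-1) @ Wg))
    return query_mod + gate * cross            # gated residual fusion

rng = np.random.default_rng(0)
text = rng.standard_normal((6, 16))    # 6 text tokens, dim 16
audio = rng.standard_normal((9, 16))   # 9 audio frames, dim 16
fused = gated_cross_modal_attention(text, audio, rng)
print(fused.shape)  # (6, 16): output stays aligned with the query modality
```

The output keeps the query modality's sequence length, so the fused representation can be fed back into a per-modality (unimodal) task head, which is what allows the cross-modal and unimodal tasks to be trained jointly.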
Pages: 8
Related Papers
50 records in total
  • [1] Learning Modality-Specific Representations with Self-Supervised Multi-Task Learning for Multimodal Sentiment Analysis
    Yu, Wenmeng
    Xu, Hua
    Yuan, Ziqi
    Wu, Jiele
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 10790 - 10797
  • [2] Multimodal Sentiment Recognition With Multi-Task Learning
    Zhang, Sun
    Yin, Chunyong
    Yin, Zhichao
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2023, 7 (01): : 200 - 209
  • [3] A text guided multi-task learning network for multimodal sentiment analysis
    Luo, Yuanyi
    Wu, Rui
    Liu, Jiafeng
    Tang, Xianglong
    NEUROCOMPUTING, 2023, 560
  • [4] Multimodal Sentiment Analysis With Two-Phase Multi-Task Learning
    Yang, Bo
    Wu, Lijun
    Zhu, Jinhua
    Shao, Bo
    Lin, Xiaola
    Liu, Tie-Yan
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 2015 - 2024
  • [5] Multi-level Multi-task representation learning with adaptive fusion for multimodal sentiment analysis
    Zhu, Chuanbo
    Chen, Min
    Li, Haomin
    Zhang, Sheng
    Liang, Han
    Sun, Chao
    Liu, Yifan
    Chen, Jincai
    NEURAL COMPUTING AND APPLICATIONS, 2025, 37 (03): : 1491 - 1508
  • [6] Multimodal sentiment analysis based on multi-layer feature fusion and multi-task learning
    Cai, Yujian
    Li, Xingguang
    Zhang, Yingyu
    Li, Jinsong
    Zhu, Fazheng
    Rao, Lin
    SCIENTIFIC REPORTS, 2025, 15 (01):
  • [7] A Dual-branch Enhanced Multi-task Learning Network for Multimodal Sentiment Analysis
    Geng, Wenxiu
    Li, Xiangxian
    Bian, Yulong
    PROCEEDINGS OF THE 2023 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2023, 2023, : 481 - 489
  • [8] Multi-task learning and mutual information maximization with crossmodal transformer for multimodal sentiment analysis
    Shi, Yang
    Cai, Jinglang
    Liao, Lei
    JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2024, : 1 - 19
  • [9] Improving sentiment analysis with multi-task learning of negation
    Barnes, Jeremy
    Velldal, Erik
    Ovrelid, Lilja
    NATURAL LANGUAGE ENGINEERING, 2021, 27 (02) : 249 - 269
  • [10] A cross modal hierarchical fusion multimodal sentiment analysis method based on multi-task learning
    Wang, Lan
    Peng, Junjie
    Zheng, Cangzhi
    Zhao, Tong
    Zhu, Li'an
    INFORMATION PROCESSING & MANAGEMENT, 2024, 61 (03)