Multi-task learning and mutual information maximization with crossmodal transformer for multimodal sentiment analysis

Cited by: 1
Authors
Shi, Yang [1]
Cai, Jinglang [1]
Liao, Lei [1]
Affiliations
[1] Sichuan Normal Univ, Coll Phys & Elect Engn, Chengdu 610101, Peoples R China
Keywords
Multimodal sentiment analysis; Multi-Task learning; Mutual information maximization; Crossmodal transformer
DOI
10.1007/s10844-024-00858-9
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The effectiveness of multimodal sentiment analysis hinges on the seamless integration of information from diverse modalities, where the quality of modality fusion directly influences sentiment analysis accuracy. Prior methods often rely on intricate fusion strategies, elevating computational costs and potentially yielding inaccurate multimodal representations due to distribution gaps and information redundancy across heterogeneous modalities. This paper centers on the backpropagation of loss and introduces a Transformer-based model called Multi-Task Learning and Mutual Information Maximization with Crossmodal Transformer (MMMT). Addressing the issue of inaccurate multimodal representation for MSA, MMMT effectively combines mutual information maximization with crossmodal Transformer to convey more modality-invariant information to multimodal representation, fully exploring modal commonalities. Notably, it utilizes multi-modal labels for uni-modal training, presenting a fresh perspective on multi-task learning in MSA. Comparative experiments on the CMU-MOSI and CMU-MOSEI datasets demonstrate that MMMT improves model accuracy while reducing computational burden, making it suitable for resource-constrained and real-time performance-requiring application scenarios. Additionally, ablation experiments validate the efficacy of multi-task learning and probe the specific impact of combining mutual information maximization with Transformer in MSA.
Pages: 1 / 19
Page count: 19
Related papers
50 in total
  • [1] Multimodal Sentiment Recognition With Multi-Task Learning
    Zhang, Sun
    Yin, Chunyong
    Yin, Zhichao
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2023, 7 (01): 200-209
  • [2] Multimodal sentiment analysis model based on multi-task learning and stacked cross-modal Transformer
    Chen Q.-H.
    Sun J.-J.
    Lou Y.-B.
    Fang Z.-J.
    Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science), 2023, 57 (12): 2421-2429
  • [3] A text guided multi-task learning network for multimodal sentiment analysis
    Luo, Yuanyi
    Wu, Rui
    Liu, Jiafeng
    Tang, Xianglong
    NEUROCOMPUTING, 2023, 560
  • [4] Multimodal Sentiment Analysis With Two-Phase Multi-Task Learning
    Yang, Bo
    Wu, Lijun
    Zhu, Jinhua
    Shao, Bo
    Lin, Xiaola
    Liu, Tie-Yan
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30: 2015-2024
  • [5] Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis
    Han, Wei
    Chen, Hui
    Poria, Soujanya
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021: 9180-9192
  • [6] Multi-Task Momentum Distillation for Multimodal Sentiment Analysis
    Lin, Ronghao
    Hu, Haifeng
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2024, 15 (02): 549-565
  • [7] Multi-level Multi-task representation learning with adaptive fusion for multimodal sentiment analysis
    Chuanbo Zhu
    Min Chen
    Haomin Li
    Sheng Zhang
    Han Liang
    Chao Sun
    Yifan Liu
    Jincai Chen
    Neural Computing and Applications, 2025, 37 (3): 1491-1508
  • [8] Multimodal sentiment analysis based on multi-layer feature fusion and multi-task learning
    Cai, Yujian
    Li, Xingguang
    Zhang, Yingyu
    Li, Jinsong
    Zhu, Fazheng
    Rao, Lin
    SCIENTIFIC REPORTS, 2025, 15 (01)
  • [9] A Dual-branch Enhanced Multi-task Learning Network for Multimodal Sentiment Analysis
    Geng, Wenxiu
    Li, Xiangxian
    Bian, Yulong
    PROCEEDINGS OF THE 2023 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2023, 2023: 481-489
  • [10] Improving sentiment analysis with multi-task learning of negation
    Barnes, Jeremy
    Velldal, Erik
    Ovrelid, Lilja
    NATURAL LANGUAGE ENGINEERING, 2021, 27 (02): 249-269