Multi-task learning and mutual information maximization with crossmodal transformer for multimodal sentiment analysis

Cited by: 1

Authors:

Shi, Yang [1]

Cai, Jinglang [1]

Liao, Lei [1]

Affiliation:

[1] Sichuan Normal Univ, Coll Phys & Elect Engn, Chengdu 610101, Peoples R China

Source:

JOURNAL OF INTELLIGENT INFORMATION SYSTEMS | 2024

Keywords:

Multimodal sentiment analysis; Multi-Task learning; Mutual information maximization; Crossmodal transformer;

DOI:

10.1007/s10844-024-00858-9

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The effectiveness of multimodal sentiment analysis (MSA) hinges on integrating information from diverse modalities: the quality of modality fusion directly influences sentiment analysis accuracy. Prior methods often rely on intricate fusion strategies, which raise computational costs and can yield inaccurate multimodal representations because of distribution gaps and information redundancy across heterogeneous modalities. This paper centers on the backpropagation of loss and introduces a Transformer-based model called Multi-Task Learning and Mutual Information Maximization with Crossmodal Transformer (MMMT). To address inaccurate multimodal representation in MSA, MMMT combines mutual information maximization with a crossmodal Transformer to convey more modality-invariant information to the multimodal representation, exploring the commonalities among modalities. Notably, it uses multimodal labels for unimodal training, offering a fresh perspective on multi-task learning in MSA. Comparative experiments on the CMU-MOSI and CMU-MOSEI datasets demonstrate that MMMT improves model accuracy while reducing computational burden, making it suitable for resource-constrained and real-time application scenarios. Additionally, ablation experiments validate the efficacy of multi-task learning and probe the specific impact of combining mutual information maximization with the Transformer in MSA.

Citations

Pages: 1 / 19

Page count: 19

50 items in total

[1] Multimodal Sentiment Recognition With Multi-Task Learning
Zhang, Sun
Yin, Chunyong
Yin, Zhichao
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2023, 7 (01) : 200 - 209
[2] Multimodal sentiment analysis model based on multi-task learning and stacked cross-modal Transformer
Chen Q.-H.
Sun J.-J.
Lou Y.-B.
Fang Z.-J.
Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science), 2023, 57 (12) : 2421 - 2429
[3] A text guided multi-task learning network for multimodal sentiment analysis
Luo, Yuanyi
Wu, Rui
Liu, Jiafeng
Tang, Xianglong
NEUROCOMPUTING, 2023, 560
[4] Multimodal Sentiment Analysis With Two-Phase Multi-Task Learning
Yang, Bo
Wu, Lijun
Zhu, Jinhua
Shao, Bo
Lin, Xiaola
Liu, Tie-Yan
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 2015 - 2024
[5] Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis
Han, Wei
Chen, Hui
Poria, Soujanya
2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021 : 9180 - 9192
[6] Multi-Task Momentum Distillation for Multimodal Sentiment Analysis
Lin, Ronghao
Hu, Haifeng
IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2024, 15 (02) : 549 - 565
[7] Multi-level Multi-task representation learning with adaptive fusion for multimodal sentiment analysis
Chuanbo Zhu
Min Chen
Haomin Li
Sheng Zhang
Han Liang
Chao Sun
Yifan Liu
Jincai Chen
Neural Computing and Applications, 2025, 37 (3) : 1491 - 1508
[8] Multimodal sentiment analysis based on multi-layer feature fusion and multi-task learning
Cai, Yujian
Li, Xingguang
Zhang, Yingyu
Li, Jinsong
Zhu, Fazheng
Rao, Lin
SCIENTIFIC REPORTS, 2025, 15 (01)
[9] A Dual-branch Enhanced Multi-task Learning Network for Multimodal Sentiment Analysis
Geng, Wenxiu
Li, Xiangxian
Bian, Yulong
PROCEEDINGS OF THE 2023 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2023, 2023 : 481 - 489
[10] Improving sentiment analysis with multi-task learning of negation
Barnes, Jeremy
Velldal, Erik
Ovrelid, Lilja
NATURAL LANGUAGE ENGINEERING, 2021, 27 (02) : 249 - 269
