Multi-level Multi-task representation learning with adaptive fusion for multimodal sentiment analysis

Cited by: 0

Authors:

Chuanbo Zhu [1]

Min Chen [2]

Haomin Li [3]

Sheng Zhang [1]

Han Liang [1]

Chao Sun [1]

Yifan Liu [1]

Jincai Chen [1]

Affiliations:

[1] Huazhong University of Science and Technology, Wuhan National Laboratory for Optoelectronics

[2] South China University of Technology, School of Computer Science and Engineering

[3] Pazhou Laboratory, School of Computer Science and Technology

[4] Huazhong University of Science and Technology, Key Laboratory of Information Storage System

[5] Ministry of Education of China

Source:

Neural Computing and Applications | 2025 / Vol. 37 / No. 3

Keywords:

Multimodal sentiment analysis; Multimodal adaptive fusion; Multi-level representation; Multi-task learning

DOI:

10.1007/s00521-024-10678-1

Abstract:

Multimodal sentiment analysis is an active task in multimodal intelligence that aims to compute a user's sentiment tendency from multimedia data. Generally, each modality offers a specific and necessary perspective on human sentiment, providing complementary and consensus information unavailable in a single modality. Nevertheless, heterogeneous multimedia data often contain inconsistent and conflicting sentiment semantics that limit model performance. In this work, we propose a Multi-level Multi-task Representation Learning with Adaptive Fusion (MuReLAF) network to bridge the semantic gap among different modalities. Specifically, we design a modality adaptive fusion block to adjust modality contributions dynamically. In addition, we build a multi-level multimodal representation framework that obtains modality-specific and modality-shared semantics through a multi-task learning strategy, where modality-specific semantics contain complementary information and modality-shared semantics contain consensus information. Extensive experiments on four publicly available datasets (MOSI, MOSEI, SIMS, and SIMSV2(s)) demonstrate that our model exhibits superior or comparable performance to state-of-the-art models. The achieved accuracies are 86.28%, 86.07%, 84.46%, and 82.78%, respectively, showcasing improvements of 0.82%, 0.84%, 1.75%, and 1.83%. Further analyses also indicate the effectiveness of our model in sentiment analysis.
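The abstract does not describe how the modality adaptive fusion block is implemented. As a rough illustration of the general idea of adjusting modality contributions dynamically, the sketch below assumes a softmax gate over per-modality scores. The function names, the gate parameters, and the feature values are all hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fusion(features, gate_params):
    """Fuse per-modality feature vectors with input-dependent weights.

    features:    dict mapping modality name -> feature vector (equal lengths)
    gate_params: dict mapping modality name -> scoring vector (hypothetical
                 stand-in for learned gate parameters)
    Returns (weights, fused): the per-modality weights and the fused vector.
    """
    names = sorted(features)
    # One scalar score per modality: dot product of gate vector and features.
    scores = [
        sum(w * x for w, x in zip(gate_params[n], features[n])) for n in names
    ]
    weights = softmax(scores)  # contributions sum to 1
    dim = len(features[names[0]])
    # Weighted sum of the modality vectors, dimension by dimension.
    fused = [
        sum(wt * features[n][i] for wt, n in zip(weights, names))
        for i in range(dim)
    ]
    return dict(zip(names, weights)), fused


# Illustrative inputs only; real features would come from modality encoders.
features = {"text": [1.0, 0.0], "audio": [0.0, 1.0], "vision": [1.0, 1.0]}
gate_params = {"text": [2.0, 0.0], "audio": [0.0, 0.5], "vision": [0.1, 0.1]}
weights, fused = adaptive_fusion(features, gate_params)
print(weights)
print(fused)
```

Because the weights depend on the input features, a modality whose gate score is higher for a given sample contributes more to the fused vector for that sample. With all gate parameters set to zero, the sketch reduces to a plain average of the modalities.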

Pages: 1491-1508

Page count: 17
