Multi-level Multi-task representation learning with adaptive fusion for multimodal sentiment analysis

Cited by: 0
Authors
Chuanbo Zhu [1]
Min Chen [2]
Haomin Li [3]
Sheng Zhang [1]
Han Liang [1]
Chao Sun [1]
Yifan Liu [1]
Jincai Chen [1]
Affiliations
[1] Huazhong University of Science and Technology, Wuhan National Laboratory for Optoelectronics
[2] South China University of Technology, School of Computer Science and Engineering
[3] Pazhou Laboratory, School of Computer Science and Technology
[4] Huazhong University of Science and Technology, Key Laboratory of Information Storage System
[5] Ministry of Education of China
Keywords
Multimodal sentiment analysis; Multimodal adaptive fusion; Multi-level representation; Multi-task learning
DOI
10.1007/s00521-024-10678-1
Abstract
Multimodal sentiment analysis is an active task in multimodal intelligence that aims to compute the user’s sentiment tendency from multimedia data. Generally, each modality offers a specific and necessary perspective on human sentiment, providing complementary and consensus information unavailable in a single modality. Nevertheless, heterogeneous multimedia data often contain inconsistent and conflicting sentiment semantics that limit model performance. In this work, we propose a Multi-level Multi-task Representation Learning with Adaptive Fusion (MuReLAF) network to bridge the semantic gap among different modalities. Specifically, we design a modality adaptive fusion block to adjust modality contributions dynamically. In addition, we build a multi-level multimodal representation framework that obtains modality-specific and modality-shared semantics through a multi-task learning strategy, where modality-specific semantics contain complementary information and modality-shared semantics contain consensus information. Extensive experiments on four publicly available datasets (MOSI, MOSEI, SIMS, and SIMSV2(s)) demonstrate that our model exhibits superior or comparable performance to state-of-the-art models. The achieved accuracies are 86.28%, 86.07%, 84.46%, and 82.78%, respectively, showcasing improvements of 0.82%, 0.84%, 1.75%, and 1.83%. Further analyses also indicate the effectiveness of our model in sentiment analysis.
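The abstract does not say how the modality adaptive fusion block is built. As a rough illustration of what "adjusting modality contributions dynamically" could mean, the sketch below scores each modality's feature vector with a gating vector, normalizes the scores with a softmax, and returns the weighted sum. This is a generic gated-fusion pattern, not the MuReLAF architecture: the function names, the gating vectors, and the toy feature values are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fusion(features, gate_weights):
    """Fuse per-modality feature vectors with input-dependent weights.

    features: dict mapping modality name -> feature vector (equal lengths).
    gate_weights: dict mapping modality name -> gating vector. These stand in
        for learned parameters and are an assumption of this sketch.
    Returns (fused_vector, contributions), where contributions sum to 1.
    """
    names = sorted(features)
    # Relevance score per modality: dot product of gate and feature vector.
    scores = [
        sum(w * x for w, x in zip(gate_weights[name], features[name]))
        for name in names
    ]
    weights = softmax(scores)
    dim = len(features[names[0]])
    # Weighted sum of the modality features, computed per dimension.
    fused = [
        sum(wt * features[name][i] for wt, name in zip(weights, names))
        for i in range(dim)
    ]
    return fused, dict(zip(names, weights))


# Toy inputs (made up for illustration, not taken from any dataset).
features = {
    "text": [0.9, 0.1, 0.4],
    "audio": [0.2, 0.3, 0.1],
    "vision": [0.5, 0.5, 0.5],
}
gates = {
    "text": [1.0, 0.0, 1.0],
    "audio": [0.5, 0.5, 0.5],
    "vision": [0.2, 0.2, 0.2],
}
fused, contributions = adaptive_fusion(features, gates)
```

Because the weights depend on the input features, a different utterance would produce a different split among text, audio, and vision. In a trained network the gating vectors would be learned jointly with the rest of the model.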
Pages: 1491-1508
Page count: 17
Related papers
  • [1] Multimodal sentiment analysis based on multi-layer feature fusion and multi-task learning
    Cai, Yujian
    Li, Xingguang
    Zhang, Yingyu
    Li, Jinsong
    Zhu, Fazheng
    Rao, Lin
    SCIENTIFIC REPORTS, 2025, 15 (01)
  • [2] Multimodal Sentiment Recognition With Multi-Task Learning
    Zhang, Sun
    Yin, Chunyong
    Yin, Zhichao
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2023, 7 (01): 200-209
  • [3] Learning Multi-Level Task Groups in Multi-Task Learning
    Han, Lei
    Zhang, Yu
    PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2015: 2638-2644
  • [4] A cross modal hierarchical fusion multimodal sentiment analysis method based on multi-task learning
    Wang, Lan
    Peng, Junjie
    Zheng, Cangzhi
    Zhao, Tong
    Zhu, Li'an
    INFORMATION PROCESSING & MANAGEMENT, 2024, 61 (03)
  • [5] A text guided multi-task learning network for multimodal sentiment analysis
    Luo, Yuanyi
    Wu, Rui
    Liu, Jiafeng
    Tang, Xianglong
    NEUROCOMPUTING, 2023, 560
  • [6] Multimodal Sentiment Analysis With Two-Phase Multi-Task Learning
    Yang, Bo
    Wu, Lijun
    Zhu, Jinhua
    Shao, Bo
    Lin, Xiaola
    Liu, Tie-Yan
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30: 2015-2024
  • [7] Multi-Task Momentum Distillation for Multimodal Sentiment Analysis
    Lin, Ronghao
    Hu, Haifeng
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2024, 15 (02): 549-565
  • [8] Multi-level network Lasso for multi-task personalized learning
    Wang, Jiankun
    Fei, Luhuan
    Sun, Lu
    PATTERN RECOGNITION, 2025, 161
  • [9] Forecasting Gang Homicides with Multi-level Multi-task Learning
    Akhter, Nasrin
    Zhao, Liang
    Arias, Desmond
    Rangwala, Huzefa
    Ramakrishnan, Naren
    SOCIAL, CULTURAL, AND BEHAVIORAL MODELING, SBP-BRIMS 2018, 2018, 10899: 28-37
  • [10] Multi-Task Learning and Multimodal Fusion for Road Segmentation
    Cheng, Bowen
    Tian, Miaomiao
    Jiang, Shuai
    Liu, Weiwei
    Pang, Yalong
    IEEE ACCESS, 2023, 11: 18947-18959