Enhancing Cross-Modal Alignment in Multimodal Sentiment Analysis via Prompt Learning

Cited by: 0
Authors
Wang, Xiaofan [1 ]
Li, Xiuhong [1 ]
Li, Zhe [2 ,3 ]
Zhou, Chenyu [1 ]
Chen, Fan [1 ]
Yang, Dan [1 ]
Affiliations
[1] Xinjiang Univ, Sch Informat Sci & Engn, Urumqi, Peoples R China
[2] Hong Kong Polytech Univ, Dept Elect & Elect Engn, Hong Kong, Peoples R China
[3] Stanford Univ, Dept Elect Engn, Stanford, CA 94305 USA
Keywords
Prompt learning; Multimodal Sentiment Analysis; Alignment
DOI
10.1007/978-981-97-8620-6_37
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multimodal sentiment analysis (MSA) aims to predict the sentiment expressed in paired images and texts. Cross-modal feature alignment is crucial for models to understand the context and extract complementary semantic features. However, most previous MSA approaches fall short in aligning features across modalities. Experimental evidence shows that prompt learning can align features effectively, and prior studies have applied prompt learning to MSA, but only in a unimodal context; applying prompt learning to multimodal feature alignment remains a challenge. This paper presents a multimodal sentiment analysis model based on alignment prompts (MSAPL). The model generates text and image alignment prompts via the Kronecker product, strengthening the engagement of the visual modality and the correlation between visual and textual data, thus enabling a better understanding of multimodal inputs. It also adopts a multi-layer, stepwise learning approach to acquire textual and image features, progressively modeling stage-wise feature relationships for rich contextual learning. Experiments on three public datasets demonstrate that the model consistently outperforms all baseline models.
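To make the Kronecker-product prompt generation concrete, the sketch below shows one plausible PyTorch rendering. It is a minimal illustration, not the authors' released implementation: the module name AlignmentPromptGenerator, the pooled 768-dimensional text and image features, the low-rank projections, and the mapping from the Kronecker interaction to a fixed number of prompt tokens are all assumptions made for exposition.

```python
# A minimal sketch, assuming pooled text/image features (e.g., from BERT/ViT)
# and a low-rank projection before the Kronecker product. All names and
# dimensions are illustrative, not the paper's actual implementation.
import torch
import torch.nn as nn

class AlignmentPromptGenerator(nn.Module):
    """Generates cross-modal alignment prompts from pooled text and image
    features via a Kronecker (outer) product of low-rank projections."""

    def __init__(self, text_dim=768, image_dim=768, rank=32,
                 prompt_len=4, prompt_dim=768):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, rank)    # low-rank text projection
        self.image_proj = nn.Linear(image_dim, rank)  # low-rank image projection
        # Maps the flattened rank x rank interaction to prompt_len prompt tokens.
        self.to_prompts = nn.Linear(rank * rank, prompt_len * prompt_dim)
        self.prompt_len, self.prompt_dim = prompt_len, prompt_dim

    def forward(self, text_feat, image_feat):
        t = self.text_proj(text_feat)    # (B, rank)
        v = self.image_proj(image_feat)  # (B, rank)
        # Batched Kronecker product of two vectors = outer product, flattened.
        interaction = torch.einsum("bi,bj->bij", t, v).flatten(1)  # (B, rank*rank)
        prompts = self.to_prompts(interaction)
        return prompts.view(-1, self.prompt_len, self.prompt_dim)  # (B, L, D)

# Example: the generated prompt tokens could be prepended to the text token
# embeddings before the multimodal encoder.
gen = AlignmentPromptGenerator()
prompts = gen(torch.randn(2, 768), torch.randn(2, 768))
print(prompts.shape)  # torch.Size([2, 4, 768])
```

In this sketch the prompt tokens would be concatenated with the textual token sequence so that the encoder attends jointly to both modalities; the paper's exact integration into its multi-layer, stepwise learning pipeline may differ.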
Pages: 541-554
Number of pages: 14