Enhancing Cross-Modal Alignment in Multimodal Sentiment Analysis via Prompt Learning

Cited by: 0
Authors
Wang, Xiaofan [1 ]
Li, Xiuhong [1 ]
Li, Zhe [2 ,3 ]
Zhou, Chenyu [1 ]
Chen, Fan [1 ]
Yang, Dan [1 ]
Affiliations
[1] Xinjiang Univ, Sch Informat Sci & Engn, Urumqi, Peoples R China
[2] Hong Kong Polytech Univ, Dept Elect & Elect Engn, Hong Kong, Peoples R China
[3] Stanford Univ, Dept Elect Engn, Stanford, CA 94305 USA
Keywords
Prompt learning; Multimodal Sentiment Analysis; Alignment;
DOI
10.1007/978-981-97-8620-6_37
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multimodal sentiment analysis (MSA) aims to predict the sentiment expressed in paired images and texts. Cross-modal feature alignment is crucial for models to understand the context and extract complementary semantic features. However, most previous MSA approaches have shown deficiencies in aligning features across different modalities. Experimental evidence shows that prompt learning can effectively align features, and previous studies have applied prompt learning to MSA tasks, but only in a unimodal context. Applying prompt learning to multimodal feature alignment remains a challenge. This paper proposes a multimodal sentiment analysis model based on alignment prompts (MSAPL). Our model generates text and image alignment prompts via the Kronecker product, enhancing visual-modality engagement and the correlation between visual and textual data, thus enabling a better understanding of multimodal data. It also employs a multi-layer, stepwise learning approach to acquire textual and image features, progressively modeling the relationships between stage-wise features for rich contextual learning. Our experiments on three public datasets demonstrate that our model consistently outperforms all baseline models.
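The abstract's core mechanism, generating a cross-modal alignment prompt from the Kronecker product of text and image features, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the names (`text_feat`, `img_feat`, `proj`), the dimensions, and the random stand-in for a learned projection are all assumptions.

```python
# Illustrative sketch of Kronecker-product alignment-prompt generation.
# All names and dimensions are hypothetical, not taken from the MSAPL paper.
import numpy as np

rng = np.random.default_rng(0)

d_text, d_img, d_prompt = 8, 6, 16

text_feat = rng.standard_normal(d_text)  # pooled text representation
img_feat = rng.standard_normal(d_img)    # pooled image representation

# The Kronecker product captures all pairwise interactions between
# the two modalities' feature dimensions: shape (d_text * d_img,)
interaction = np.kron(text_feat, img_feat)

# A learned projection would map the interaction tensor to a
# fixed-length alignment prompt; a random matrix stands in for it here.
proj = rng.standard_normal((d_prompt, d_text * d_img)) / np.sqrt(d_text * d_img)
alignment_prompt = proj @ interaction

print(interaction.shape)       # (48,)
print(alignment_prompt.shape)  # (16,)
```

The appeal of the Kronecker product here is that, unlike concatenation, it models every pairwise feature interaction between modalities, at the cost of a quadratically larger intermediate vector that the projection then compresses.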
Pages: 541-554 (14 pages)
Related Papers (50 records)
  • [21] Multimodal Graph Learning for Cross-Modal Retrieval
    Xie, Jingyou
    Zhao, Zishuo
    Lin, Zhenzhou
    Shen, Ying
    PROCEEDINGS OF THE 2023 SIAM INTERNATIONAL CONFERENCE ON DATA MINING, SDM, 2023, : 145 - 153
  • [22] A Text-Centered Shared-Private Framework via Cross-Modal Prediction for Multimodal Sentiment Analysis
    Wu, Yang
    Lin, Zijie
    Zhao, Yanyan
    Qin, Bing
    Zhu, Li-Nan
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 4730 - 4738
  • [23] Prompt Learning for Multimodal Intent Recognition with Modal Alignment Perception
    Chen, Yuzhao
    Zhu, Wenhua
    Yu, Weilun
    Xue, Hongfei
    Fu, Hao
    Lin, Jiali
    Jiang, Dazhi
    COGNITIVE COMPUTATION, 2024, 16 (06) : 3417 - 3428
  • [24] Which is Making the Contribution: Modulating Unimodal and Cross-modal Dynamics for Multimodal Sentiment Analysis
    Zeng, Ying
    Mai, Sijie
    Hu, Haifeng
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 1262 - 1274
  • [25] Multimodal Sentiment Analysis Network Based on Distributional Transformation and Gated Cross-Modal Fusion
    Zhang, Yuchen
    Thong, Hong
    Chen, Guilin
    Alhusaini, Naji
    Zhao, Shenghui
    Wu, Cheng
    2024 INTERNATIONAL CONFERENCE ON NETWORKING AND NETWORK APPLICATIONS, NANA 2024, 2024, : 496 - 503
  • [26] Multimodal Sentiment Analysis in Realistic Environments Based on Cross-Modal Hierarchical Fusion Network
    Huang, Ju
    Lu, Pengtao
    Sun, Shuifa
    Wang, Fangyi
    ELECTRONICS, 2023, 12 (16)
  • [27] Video multimodal sentiment analysis using cross-modal feature translation and dynamical propagation
    Gan, Chenquan
    Tang, Yu
    Fu, Xiang
    Zhu, Qingyi
    Jain, Deepak Kumar
    Garcia, Salvador
    KNOWLEDGE-BASED SYSTEMS, 2024, 299
  • [28] Cross-modal dynamic sentiment annotation for speech sentiment analysis
    Chen, Jincai
    Sun, Chao
    Zhang, Sheng
    Zeng, Jiangfeng
    COMPUTERS & ELECTRICAL ENGINEERING, 2023, 106
  • [29] Cross-modal complementary network with hierarchical fusion for multimodal sentiment classification
    Peng, Cheng
    Zhang, Chunxia
    Xue, Xiaojun
    Gao, Jiameng
    Liang, Hongjian
    Niu, Zhengdong
    TSINGHUA SCIENCE AND TECHNOLOGY, 2022, 27 (04) : 664 - 679