Few-shot Multimodal Sentiment Analysis Based on Multimodal Probabilistic Fusion Prompts

Cited by: 3
Authors
Yang, Xiaocui [1 ]
Feng, Shi [1 ]
Wang, Daling [1 ]
Zhang, Yifei [1 ]
Poria, Soujanya [2 ]
Affiliations
[1] Northeastern Univ, Shenyang, Peoples R China
[2] Singapore Univ Technol & Design, Singapore, Singapore
Funding
National Natural Science Foundation of China;
Keywords
Multimodal sentiment analysis; Multimodal few-shot; Consistently distributed sampling; Unified multimodal prompt; Multimodal demonstrations; Multimodal probabilistic fusion;
DOI
10.1145/3581783.3612181
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Multimodal sentiment analysis has gained significant attention due to the proliferation of multimodal content on social media. However, existing studies in this area rely heavily on large-scale supervised data, which is time-consuming and labor-intensive to collect. Thus, there is a need to address the challenge of few-shot multimodal sentiment analysis. To tackle this problem, we propose a novel method called Multimodal Probabilistic Fusion Prompts (MultiPoint) that leverages diverse cues from different modalities for multimodal sentiment detection in the few-shot scenario. Specifically, we start by introducing a Consistently Distributed Sampling approach called CDS, which ensures that the few-shot dataset has the same category distribution as the full dataset. Unlike previous approaches primarily using prompts based on the text modality, we design unified multimodal prompts to reduce discrepancies between different modalities and dynamically incorporate multimodal demonstrations into the context of each multimodal instance. To enhance the model's robustness, we introduce a probabilistic fusion method to fuse output predictions from multiple diverse prompts for each input. Our extensive experiments on six datasets demonstrate the effectiveness of our approach. First, our method outperforms strong baselines in the multimodal few-shot setting. Furthermore, under the same amount of data (1% of the full dataset), our CDS-based experimental results significantly outperform those based on previously sampled datasets constructed from the same number of instances of each class.
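The two mechanisms the abstract names can be illustrated with a minimal sketch. This is not the paper's implementation; the function names, the rounding rule for per-class counts, and the use of simple probability averaging as the fusion step are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict

def consistently_distributed_sample(labels, k):
    """Stratified few-shot sampling in the spirit of CDS: draw roughly k
    indices whose label distribution mirrors the full dataset's, instead
    of a fixed number of instances per class. (Illustrative sketch; the
    per-class rounding rule is an assumption.)"""
    by_label = defaultdict(list)
    for i, y in enumerate(labels):
        by_label[y].append(i)
    total = len(labels)
    picked = []
    for y, idxs in by_label.items():
        # Allocate each class a share of k proportional to its frequency.
        n_y = max(1, round(k * len(idxs) / total))
        picked.extend(random.sample(idxs, min(n_y, len(idxs))))
    return picked

def probabilistic_fusion(prompt_probs):
    """Fuse the class-probability vectors produced by several diverse
    prompts by averaging them (one simple instantiation of probabilistic
    fusion; the paper may weight prompts differently)."""
    n = len(prompt_probs)
    num_classes = len(prompt_probs[0])
    return [sum(p[c] for p in prompt_probs) / n for c in range(num_classes)]
```

For example, sampling 1% of a corpus that is 80% positive and 20% negative yields a few-shot set that is also roughly 80/20, and two prompts predicting `[0.6, 0.4]` and `[0.8, 0.2]` fuse to `[0.7, 0.3]`.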
Pages: 6045-6053
Page count: 9