UEFN: Efficient uncertainty estimation fusion network for reliable multimodal sentiment analysis

Cited by: 0
Authors
Wang, Shuai [1 ,2 ]
Ratnavelu, K. [2 ]
Bin Shibghatullah, Abdul Samad [2 ,3 ]
Affiliations
[1] Yuncheng Univ, Shanxi Prov Optoelect Informat Sci & Technol Lab, Yuncheng 044000, Peoples R China
[2] UCSI Univ, Inst Comp Sci & Digital Innovat, Fac Appl Sci, 1 Jalan Menara Gading, Cheras 56000, Kuala Lumpur, Malaysia
[3] Univ Tenaga Nas, Coll Engn, Jalan Kajang Puchong, Kajang 43009, Selangor, Malaysia
Keywords
Uncertainty estimation; Multimodal representation; Sentiment analysis; Decision fusion; Social media
DOI
10.1007/s10489-024-06113-6
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The rapid evolution of the digital era has greatly transformed social media, resulting in more diverse emotional expressions and increasingly complex public discourse. Consequently, identifying relationships within multimodal data has become increasingly challenging. Most current multimodal sentiment analysis (MSA) methods concentrate on merging data from diverse modalities into an integrated feature representation, leveraging the complementary nature of multimodal data to improve recognition performance. However, these approaches often overlook prediction reliability. To address this, we propose the uncertainty estimation fusion network (UEFN), a reliable MSA method based on uncertainty estimation. UEFN combines the Dirichlet distribution with Dempster-Shafer evidence theory (DSET) to predict the probability distribution and uncertainty of the text, speech, and image modalities, fusing the predictions at the decision level. Specifically, the method first represents the contextual features of the text, speech, and image modalities separately. It then employs a fully connected neural network to transform the features of each modality into evidence. Next, it parameterizes the evidence of each modality via the Dirichlet distribution and estimates the per-modality probability distribution and uncertainty. Finally, DSET fuses the per-modality predictions to yield the sentiment analysis result together with an uncertainty estimate; this fusion stage constitutes the multimodal decision fusion layer (MDFL). Additionally, based on the modality uncertainties produced by subjective logic theory, we compute feature weights, apply them to the corresponding features, concatenate the weighted features, and feed them into a feedforward neural network for sentiment classification, forming the adaptive weight fusion layer (AWFL). MDFL and AWFL are then trained jointly in a multitask setting. Experimental comparisons demonstrate that UEFN not only achieves excellent performance but also provides uncertainty estimates alongside its predictions, improving the reliability and interpretability of the results.
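As a rough illustration of the pipeline the abstract describes, the sketch below shows the standard subjective-logic mapping from class evidence to belief and uncertainty masses (Dirichlet alpha = evidence + 1) and the reduced Dempster-Shafer combination of two modality opinions. This is a minimal NumPy sketch of the general technique, not the authors' implementation; the function names, toy evidence values, and the final weighting heuristic are our assumptions.

```python
import numpy as np

def dirichlet_opinion(evidence):
    """Map non-negative class evidence (length K) to a subjective-logic
    opinion: per-class belief masses b and an uncertainty mass u, with
    b.sum() + u == 1. Dirichlet parameters are alpha = evidence + 1."""
    K = evidence.shape[-1]
    S = evidence.sum(-1) + K          # Dirichlet strength, sum(alpha)
    b = evidence / S                  # belief mass per class
    u = K / S                         # uncertainty mass
    return b, u

def ds_combine(b1, u1, b2, u2):
    """Reduced Dempster-Shafer rule for two opinions, as commonly used in
    evidential multi-view fusion: belief mass placed on conflicting class
    pairs is discarded and the remainder is renormalized."""
    conflict = b1.sum() * b2.sum() - b1 @ b2   # mass on disagreeing pairs
    scale = 1.0 - conflict
    b = (b1 * b2 + b1 * u2 + b2 * u1) / scale
    u = (u1 * u2) / scale
    return b, u

# Toy example: three sentiment classes, confident text vs. uncertain speech.
b_t, u_t = dirichlet_opinion(np.array([8.0, 1.0, 0.5]))
b_s, u_s = dirichlet_opinion(np.array([0.2, 0.4, 0.1]))
b, u = ds_combine(b_t, u_t, b_s, u_s)
print(b.round(3), round(float(u), 3))  # fused opinion tracks the reliable modality

# A plausible (hypothetical) AWFL-style feature weight per modality; the
# abstract only states that weights are derived from the uncertainties.
w = np.array([1 - u_t, 1 - u_s])
w /= w.sum()
```

In this sketch the low-uncertainty text opinion dominates the fused belief, which is the behavior decision-level evidential fusion is designed to produce.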
Pages: 20