UEFN: Efficient uncertainty estimation fusion network for reliable multimodal sentiment analysis

Cited by: 0
Authors
Wang, Shuai [1 ,2 ]
Ratnavelu, K. [2 ]
Bin Shibghatullah, Abdul Samad [2 ,3 ]
Affiliations
[1] Yuncheng Univ, Shanxi Prov Optoelect Informat Sci & Technol Lab, Yuncheng 044000, Peoples R China
[2] UCSI Univ, Inst Comp Sci & Digital Innovat, Fac Appl Sci, 1 Jalan Menara Gading, Cheras 56000, Kuala Lumpur, Malaysia
[3] Univ Tenaga Nas, Coll Engn, Jalan Kajang Puchong, Kajang 43009, Selangor, Malaysia
Keywords
Uncertainty estimation; Multimodal representation; Sentiment analysis; Decision fusion; Social media
DOI
10.1007/s10489-024-06113-6
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
The rapid evolution of the digital era has greatly transformed social media, resulting in more diverse emotional expressions and increasingly complex public discourse. Consequently, identifying relationships within multimodal data has become more challenging. Most current multimodal sentiment analysis (MSA) methods concentrate on merging data from diverse modalities into an integrated feature representation to enhance recognition performance by leveraging the complementary nature of multimodal data. However, these approaches often overlook prediction reliability. To address this, we propose the uncertainty estimation fusion network (UEFN), a reliable MSA method based on uncertainty estimation. UEFN combines the Dirichlet distribution and Dempster-Shafer evidence theory (DSET) to predict the probability distribution and uncertainty of the text, speech, and image modalities, fusing the predictions at the decision level. Specifically, the method first represents the contextual features of the text, speech, and image modalities separately. It then employs a fully connected neural network to transform the features of each modality into evidence. Subsequently, it parameterizes the evidence of each modality via the Dirichlet distribution and estimates the per-modality probability distribution and uncertainty. Finally, DSET fuses the per-modality predictions to obtain the sentiment analysis result together with its uncertainty estimate; we refer to this stage as the multimodal decision fusion layer (MDFL). Additionally, on the basis of the modality uncertainty produced by subjective logic theory, we compute feature weights, apply them to the corresponding features, concatenate the weighted features, and feed them into a feedforward neural network for sentiment classification, forming the adaptive weight fusion layer (AWFL). MDFL and AWFL are then trained jointly in a multitask setting. Experimental comparisons demonstrate that UEFN not only achieves excellent performance but also provides uncertainty estimates alongside its predictions, enhancing the reliability and interpretability of the results.
Pages: 20
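As a reader's aid, the following is a minimal, self-contained sketch of the evidential fusion idea described in the abstract: each modality's non-negative evidence is parameterized as a Dirichlet distribution, converted into a subjective-logic opinion (per-class belief masses plus an uncertainty mass), and the per-modality opinions are merged with a reduced Dempster-Shafer combination rule. The function names, the toy evidence values, and the specific reduced combination rule are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch of Dirichlet evidence + subjective logic + Dempster-Shafer fusion.
# Names and the reduced combination rule are assumptions for illustration.
import numpy as np

def opinion_from_evidence(evidence):
    # Dirichlet parameters alpha_k = e_k + 1; Dirichlet strength S = sum(alpha).
    # Subjective-logic opinion: belief b_k = e_k / S, uncertainty u = K / S,
    # so that sum(b) + u == 1.
    alpha = evidence + 1.0
    strength = alpha.sum()
    belief = evidence / strength
    uncertainty = evidence.size / strength
    return belief, uncertainty

def ds_combine(b1, u1, b2, u2):
    # Reduced Dempster-Shafer rule: conflict C is the belief mass the two
    # opinions assign to different classes; the surviving mass is renormalized.
    conflict = np.outer(b1, b2).sum() - (b1 * b2).sum()
    scale = 1.0 - conflict
    belief = (b1 * b2 + b1 * u2 + b2 * u1) / scale
    uncertainty = (u1 * u2) / scale
    return belief, uncertainty

# Toy per-modality evidence for K = 3 sentiment classes
# (e.g. negative / neutral / positive), as might come from a
# softplus-activated fully connected head for each modality.
text_evidence  = np.array([9.0, 1.0, 0.5])
speech_evidence = np.array([4.0, 2.0, 1.0])
image_evidence = np.array([0.5, 0.5, 0.5])  # weak evidence -> high uncertainty

belief, uncertainty = opinion_from_evidence(text_evidence)
for evidence in (speech_evidence, image_evidence):
    b2, u2 = opinion_from_evidence(evidence)
    belief, uncertainty = ds_combine(belief, uncertainty, b2, u2)

print("fused belief:", belief.round(3), "fused uncertainty:", round(uncertainty, 3))
```

In this toy run, the weak image evidence carries a large uncertainty mass, so the fused opinion is dominated by the more confident text and speech modalities; the same per-modality uncertainties are the kind of signal the AWFL branch could turn into feature weights before concatenation and classification.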