UEFN: Efficient uncertainty estimation fusion network for reliable multimodal sentiment analysis

Cited by: 0
Authors
Wang, Shuai [1 ,2 ]
Ratnavelu, K. [2 ]
Bin Shibghatullah, Abdul Samad [2 ,3 ]
Affiliations
[1] Yuncheng Univ, Shanxi Prov Optoelect Informat Sci & Technol Lab, Yuncheng 044000, Peoples R China
[2] UCSI Univ, Inst Comp Sci & Digital Innovat, Fac Appl Sci, 1 Jalan Menara Gading, Cheras 56000, Kuala Lumpur, Malaysia
[3] Univ Tenaga Nas, Coll Engn, Jalan Kajang Puchong, Kajang 43009, Selangor, Malaysia
Keywords
Uncertainty estimation; Multimodal representation; Sentiment analysis; Decision fusion; Social media
DOI
10.1007/s10489-024-06113-6
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
The rapid evolution of the digital era has greatly transformed social media, resulting in more diverse emotional expressions and increasingly complex public discourse. Consequently, identifying relationships within multimodal data has become more challenging. Most current multimodal sentiment analysis (MSA) methods concentrate on merging data from diverse modalities into an integrated feature representation to enhance recognition performance by leveraging the complementary nature of multimodal data. However, these approaches often overlook prediction reliability. To address this, we propose the uncertainty estimation fusion network (UEFN), a reliable MSA method based on uncertainty estimation. UEFN combines the Dirichlet distribution and Dempster-Shafer evidence theory (DSET) to predict the probability distribution and uncertainty of the text, speech, and image modalities, fusing the predictions at the decision level. Specifically, the method first represents the contextual features of the text, speech, and image modalities separately. It then employs a fully connected neural network to transform the features of each modality into evidence. Subsequently, it parameterizes the evidence of each modality via the Dirichlet distribution and estimates the per-modality probability distribution and uncertainty. Finally, DSET fuses the per-modality predictions to obtain the sentiment analysis result together with its uncertainty estimate; we refer to this stage as the multimodal decision fusion layer (MDFL). Additionally, on the basis of the modality uncertainty produced by subjective logic theory, we compute feature weights, apply them to the corresponding features, concatenate the weighted features, and feed them into a feedforward neural network for sentiment classification, forming the adaptive weight fusion layer (AWFL). MDFL and AWFL are then trained jointly in a multitask setting. Experimental comparisons demonstrate that UEFN not only achieves excellent performance but also provides uncertainty estimates alongside its predictions, enhancing the reliability and interpretability of the results.
Pages: 20
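As a reader's aid, the following is a minimal, self-contained sketch of the evidential fusion idea described in the abstract: each modality's non-negative evidence is parameterized as a Dirichlet distribution, converted into a subjective-logic opinion (per-class belief masses plus an uncertainty mass), and the per-modality opinions are merged with a reduced Dempster-Shafer combination rule. The function names, the toy evidence values, and the specific reduced combination rule are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch of Dirichlet evidence + subjective logic + Dempster-Shafer fusion.
# Names and the reduced combination rule are assumptions for illustration.
import numpy as np

def opinion_from_evidence(evidence):
    # Dirichlet parameters alpha_k = e_k + 1; Dirichlet strength S = sum(alpha).
    # Subjective-logic opinion: belief b_k = e_k / S, uncertainty u = K / S,
    # so that sum(b) + u == 1.
    alpha = evidence + 1.0
    strength = alpha.sum()
    belief = evidence / strength
    uncertainty = evidence.size / strength
    return belief, uncertainty

def ds_combine(b1, u1, b2, u2):
    # Reduced Dempster-Shafer rule: conflict C is the belief mass the two
    # opinions assign to different classes; the surviving mass is renormalized.
    conflict = np.outer(b1, b2).sum() - (b1 * b2).sum()
    scale = 1.0 - conflict
    belief = (b1 * b2 + b1 * u2 + b2 * u1) / scale
    uncertainty = (u1 * u2) / scale
    return belief, uncertainty

# Toy per-modality evidence for K = 3 sentiment classes
# (e.g. negative / neutral / positive), as might come from a
# softplus-activated fully connected head for each modality.
text_evidence  = np.array([9.0, 1.0, 0.5])
speech_evidence = np.array([4.0, 2.0, 1.0])
image_evidence = np.array([0.5, 0.5, 0.5])  # weak evidence -> high uncertainty

belief, uncertainty = opinion_from_evidence(text_evidence)
for evidence in (speech_evidence, image_evidence):
    b2, u2 = opinion_from_evidence(evidence)
    belief, uncertainty = ds_combine(belief, uncertainty, b2, u2)

print("fused belief:", belief.round(3), "fused uncertainty:", round(uncertainty, 3))
```

In this toy run, the weak image evidence carries a large uncertainty mass, so the fused opinion is dominated by the more confident text and speech modalities; the same per-modality uncertainties are the kind of signal the AWFL branch could turn into feature weights before concatenation and classification.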