Scanning, attention, and reasoning multimodal content for sentiment analysis

Cited by: 5
Authors
Liu, Yun [1 ]
Li, Zhoujun [2 ]
Zhou, Ke [1 ]
Zhang, Leilei [1 ]
Li, Lang [1 ]
Tian, Peng [1 ]
Shen, Shixun [1 ]
Affiliations
[1] Moutai Inst, Dept Automat, Renhuai 564507, Guizhou Province, Peoples R China
[2] Beihang Univ, Sch Comp Sci & Engn, State Key Lab Software Dev Environm, Beijing 100191, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multimodal sentiment analysis; Attention; Reasoning; Fusion
DOI
10.1016/j.knosys.2023.110467
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
The rise of social networks has provided people with platforms to share their lives and emotions, often in multimodal forms such as images paired with descriptive texts. Capturing the emotions embedded in this multimodal content poses significant research challenges and offers considerable practical value. Existing methods usually make sentiment predictions through a single-round reasoning process with multimodal attention networks; however, this may be insufficient for tasks that require deep understanding and complex reasoning. To effectively comprehend multimodal content and predict the correct sentiment tendencies, we propose the Scanning, Attention, and Reasoning (SAR) model for multimodal sentiment analysis. Specifically, a perceptual scanning model is designed to roughly perceive the image and text content, as well as the intrinsic correlation between them. To deeply understand the complementary features between images and texts, an intensive attention model is proposed for cross-modal feature association learning. The multimodal joint features from the scanning and attention models are fused together as the representation of a multimodal node in the social network. A heterogeneous reasoning model implemented with a graph neural network is constructed to capture the influence of network communication in social networks and make sentiment predictions. Extensive experiments conducted on three benchmark datasets confirm the effectiveness and superiority of our model compared with state-of-the-art methods. (c) 2023 Elsevier B.V. All rights reserved.
Pages: 11
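The abstract describes a three-stage pipeline (perceptual scanning, cross-modal attention, graph-based reasoning over the social network) but, as a record entry, gives no implementation details. Below is a minimal PyTorch sketch of such a pipeline for orientation only; the class name SARSketch, all dimensions, the pooling and fusion choices, and the single-layer message-passing step are assumptions for illustration, not the authors' actual architecture.

```python
# Minimal sketch of a SAR-style pipeline: scanning -> cross-modal attention
# -> graph reasoning. Every design choice here is an assumption for
# illustration; the paper's real modules are more elaborate.
import torch
import torch.nn as nn

class SARSketch(nn.Module):
    def __init__(self, dim=256, num_classes=3):
        super().__init__()
        # Scanning: coarse joint perception of global image and text features.
        self.scan = nn.Linear(2 * dim, dim)
        # Attention: text tokens attend to image regions for cross-modal
        # feature association.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        # Fusion of scanning and attention features into one node representation.
        self.fuse = nn.Linear(2 * dim, dim)
        # Reasoning: one round of neighbor aggregation over the social graph,
        # a simple stand-in for the paper's heterogeneous GNN.
        self.gnn = nn.Linear(2 * dim, dim)
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, img_regions, text_tokens, adj):
        # img_regions: (N, R, dim) region features; text_tokens: (N, T, dim)
        # token features; adj: (N, N) row-normalized social-network adjacency.
        img_g = img_regions.mean(dim=1)   # global image feature
        txt_g = text_tokens.mean(dim=1)   # global text feature
        scan = torch.tanh(self.scan(torch.cat([img_g, txt_g], dim=-1)))
        attended, _ = self.cross_attn(text_tokens, img_regions, img_regions)
        attn = attended.mean(dim=1)       # pooled cross-modal feature
        node = torch.tanh(self.fuse(torch.cat([scan, attn], dim=-1)))
        neighbors = adj @ node            # aggregate neighbor messages
        node = torch.relu(self.gnn(torch.cat([node, neighbors], dim=-1)))
        return self.classifier(node)      # per-node sentiment logits

# Toy usage: 5 posts, 6 image regions and 12 text tokens each.
model = SARSketch()
logits = model(torch.randn(5, 6, 256), torch.randn(5, 12, 256), torch.eye(5))
print(logits.shape)  # torch.Size([5, 3])
```

The key structural point the sketch preserves is that the scanning and attention features are fused into a single node embedding before any graph reasoning, so the GNN operates on multimodal posts as nodes rather than on raw modalities.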