Bimodal Fusion Network with Multi-Head Attention for Multimodal Sentiment Analysis

Cited by: 3
Authors
Zhang, Rui [1 ,2 ]
Xue, Chengrong [1 ,2 ]
Qi, Qingfu [3 ]
Lin, Liyuan [2 ]
Zhang, Jing [1 ,2 ]
Zhang, Lun [1 ,2 ]
Affiliations
[1] Tianjin Sino German Univ Appl Sci, Sch Software & Commun, Tianjin 300222, Peoples R China
[2] Tianjin Univ Sci & Technol, Coll Elect Informat & Automation, Tianjin 300222, Peoples R China
[3] Gaussian Robot Pte Ltd, Tianjin 200100, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2023, Vol. 13, No. 3
Keywords
multimodal sentiment analysis; bimodal fusion; multi-head attention; EMOTION RECOGNITION; FEATURES;
DOI
10.3390/app13031915
CLC Classification
O6 [Chemistry];
Subject Classification Code
0703;
Abstract
The enrichment of social media expression has made multimodal sentiment analysis a research hotspot. However, modality heterogeneity poses great difficulties for effective cross-modal fusion, especially the modality alignment problem and the uncontrolled vector offset during fusion. In this paper, we propose a bimodal multi-head attention network (BMAN) based on text and audio, which adaptively captures intramodal utterance features and complex intermodal alignment relationships. Specifically, we first set up two independent unimodal encoders to extract the semantic features within each modality. Considering that different modalities deserve different weights, we further build a joint decoder that fuses the audio information into the text representation based on learnable weights, avoiding an unreasonable vector offset. The resulting cross-modal representation is used to improve sentiment prediction performance. Experiments on both the aligned and unaligned CMU-MOSEI datasets show that our model outperforms multiple baselines and has outstanding advantages in solving the cross-modal alignment problem.
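The fusion scheme described in the abstract (text queries attending over audio features via multi-head attention, with weighted fusion back into the text representation) can be sketched as follows. This is an illustrative NumPy sketch, not the authors' released implementation: the projection matrices are randomly initialised rather than trained, and a fixed scalar gate `alpha` stands in for the paper's learnable fusion weights that bound the vector offset.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_cross_attention(text, audio, num_heads, rng):
    """One fusion direction: text tokens (queries) attend over
    audio frames (keys/values).

    text:  (T_t, d) text-encoder outputs
    audio: (T_a, d) audio-encoder outputs
    Projection weights are random here; in training they are learned.
    """
    T_t, d = text.shape
    d_h = d // num_heads
    Wq, Wk, Wv, Wo = (rng.standard_normal((d, d)) / np.sqrt(d)
                      for _ in range(4))
    # Split projections into heads: (num_heads, seq_len, d_h)
    Q = (text @ Wq).reshape(T_t, num_heads, d_h).transpose(1, 0, 2)
    K = (audio @ Wk).reshape(-1, num_heads, d_h).transpose(1, 0, 2)
    V = (audio @ Wv).reshape(-1, num_heads, d_h).transpose(1, 0, 2)
    # Scaled dot-product attention per head over the audio sequence
    attn = softmax(Q @ K.transpose(0, 2, 1) / np.sqrt(d_h), axis=-1)
    heads = (attn @ V).transpose(1, 0, 2).reshape(T_t, d)
    return heads @ Wo

def gated_fusion(text, audio, num_heads=4, alpha=0.5, rng=None):
    """Fuse audio into the text representation. The residual connection
    plus the gate `alpha` keeps the fused vectors anchored to the text
    representation, limiting the offset introduced by fusion."""
    rng = rng if rng is not None else np.random.default_rng(0)
    cross = multi_head_cross_attention(text, audio, num_heads, rng)
    return text + alpha * cross

rng = np.random.default_rng(0)
text = rng.standard_normal((20, 64))   # 20 text tokens, d = 64
audio = rng.standard_normal((50, 64))  # 50 audio frames, d = 64 (unaligned)
fused = gated_fusion(text, audio)
print(fused.shape)  # (20, 64)
```

Note that the text and audio sequence lengths differ (20 vs. 50), so no explicit alignment is required: the attention weights implicitly align each text token with the relevant audio frames, which is how attention-based fusion handles the unaligned setting.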
Pages: 12
Related Papers
50 records
  • [21] Using recurrent neural network structure with Enhanced Multi-Head Self-Attention for sentiment analysis
    Leng, Xue-Liang
    Miao, Xiao-Ai
    Liu, Tao
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (08) : 12581 - 12600
  • [22] Bidirectional recurrent neural network with multi-head attention for automatic scene generation using sentiment analysis
    Dharaniya, R.
    Indumathi, J.
    Uma, G. V.
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2022, 43 (06) : 7023 - 7039
  • [23] A transformer-encoder-based multimodal multi-attention fusion network for sentiment analysis
    Liu, Cong
    Wang, Yong
    Yang, Jing
    APPLIED INTELLIGENCE, 2024, 54 (17-18) : 8415 - 8441
  • [24] Multi-layer cross-modality attention fusion network for multimodal sentiment analysis
    Yin, Z.
    Du, Y.
    Liu, Y.
    Wang, Y.
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (21) : 60171 - 60187
  • [25] Multimodal Sentiment Analysis Based on Attention Mechanism and Tensor Fusion Network
    Zhang, Kang
    Geng, Yushui
    Zhao, Jing
    Li, Wenxiao
    Liu, Jianxin
    2021 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2021, : 1473 - 1477
  • [26] Gated attention fusion network for multimodal sentiment classification
    Du, Yongping
    Liu, Yang
    Peng, Zhi
    Jin, Xingnan
    KNOWLEDGE-BASED SYSTEMS, 2022, 240
  • [27] Multi-attention Fusion for Multimodal Sentiment Classification
    Li, Guangmin
    Zeng, Xin
    Chen, Chi
    Zhou, Long
    PROCEEDINGS OF 2024 ACM ICMR WORKSHOP ON MULTIMODAL VIDEO RETRIEVAL, ICMR-MVR 2024, 2024, : 1 - 7
  • [28] Filter gate network based on multi-head attention for aspect-level sentiment classification
    Zhou, Ziyu
    Liu, Fang'ai
    NEUROCOMPUTING, 2021, 441 : 214 - 225
  • [29] An interactive multi-head self-attention capsule network model for aspect sentiment classification
    She, Lina
    Gong, Hongfang
    Zhang, Siyu
    JOURNAL OF SUPERCOMPUTING, 2024, 80 (07): : 9327 - 9352