TEDT: Transformer-Based Encoding–Decoding Translation Network for Multimodal Sentiment Analysis

Cited by: 0
Authors
Fan Wang
Shengwei Tian
Long Yu
Jing Liu
Junwen Wang
Kun Li
Yongtao Wang
Affiliations
[1] University of Xinjiang, School of Software
[2] University of Xinjiang, Network and Information Center
Source
Cognitive Computation | 2023 / Vol. 15
Keywords
Multimodal sentiment analysis; Transformer; Multimodal fusion; Multimodal attention
DOI
Not available
Abstract
Multimodal sentiment analysis is a popular and challenging research topic in natural language processing, but the individual modalities in a video can affect sentiment analysis results differently. In the temporal dimension, natural language sentiment is influenced by nonnatural language sentiment, which may enhance or weaken the original sentiment of the current natural language. In addition, nonnatural language features are generally of poor quality, which hinders effective multimodal fusion. To address these issues, we propose a transformer-based multimodal encoding–decoding translation network and adopt a joint encoding–decoding method with text as the primary information and sound and image as the secondary information. To reduce the negative impact of nonnatural language data on natural language data, we propose a modality reinforcement cross-attention module that converts nonnatural language features into natural language features to improve their quality and better integrate multimodal features. Moreover, a dynamic filtering mechanism filters out the erroneous information generated in the cross-modal interaction to further improve the final output. We evaluated the proposed method on two multimodal sentiment analysis benchmark datasets (MOSI and MOSEI), on which it achieved accuracies of 89.3% and 85.9%, respectively, outperforming the current state-of-the-art methods. Our model can greatly improve the effect of multimodal fusion and more accurately analyze human sentiment.
Pages: 289-303
Number of pages: 14
Related papers
50 items in total
  • [41] Transformer-Based Interactive Multi-Modal Attention Network for Video Sentiment Detection
    Xuqiang Zhuang
    Fangai Liu
    Jian Hou
    Jianhua Hao
    Xiaohong Cai
    Neural Processing Letters, 2022, 54 : 1943 - 1960
  • [42] Multimodal transformer with adaptive modality weighting for multimodal sentiment analysis
    Wang, Yifeng
    He, Jiahao
    Wang, Di
    Wang, Quan
    Wan, Bo
    Luo, Xuemei
    NEUROCOMPUTING, 2024, 572
  • [43] Novelty fused image and text models based on deep neural network and transformer for multimodal sentiment analysis
    Hung, Bui Thanh
    Thu, Nguyen Hoang Minh
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (25) : 66263 - 66281
  • [44] Transformer-based deep learning models for the sentiment analysis of social media data
    Kokab, Sayyida Tabinda
    Asghar, Sohail
    Naz, Shehneela
    ARRAY, 2022, 14
  • [45] Enhancing the accuracy of transformer-based embeddings for sentiment analysis in social big data
    Zemzem, Wiem
    Tagina, Moncef
    INTERNATIONAL JOURNAL OF COMPUTER APPLICATIONS IN TECHNOLOGY, 2023, 73 (03) : 169 - 177
  • [46] TDFNet: Transformer-Based Deep-Scale Fusion Network for Multimodal Emotion Recognition
    Zhao, Zhengdao
    Wang, Yuhua
    Shen, Guang
    Xu, Yuezhu
    Zhang, Jiayuan
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 3771 - 3782
  • [47] TMSS: An End-to-End Transformer-Based Multimodal Network for Segmentation and Survival Prediction
    Saeed, Numan
    Sobirov, Ikboljon
    Al Majzoub, Roba
    Yaqub, Mohammad
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT VII, 2022, 13437 : 319 - 329
  • [48] Multimodal sentiment analysis with unidirectional modality translation
    Yang, Bo
    Shao, Bo
    Wu, Lijun
    Lin, Xiaola
    NEUROCOMPUTING, 2022, 467 : 130 - 137
  • [49] A transformer-based network for speech recognition
    Tang L.
    International Journal of Speech Technology, 2023, 26 (02) : 531 - 539
  • [50] Conv-Enhanced Transformer and Robust Optimization Network for robust multimodal sentiment analysis
    Sun, Bin
    Jia, Li
    Cui, Yiming
    Wang, Na
    Jiang, Tao
    NEUROCOMPUTING, 2025, 634