Transformer-Based Feature Fusion Approach for Multimodal Visual Sentiment Recognition Using Tweets in the Wild

Cited by: 0
Authors
Alzamzami, Fatimah [1]
Saddik, Abdulmotaleb El [1,2]
Affiliations
[1] Univ Ottawa, Sch Elect Engn & Comp Sci, Multimedia Commun Res Lab, Ottawa, ON K1N 6N5, Canada
[2] Mohamed Bin Zayed Univ Artificial Intelligence, Dept Comp Vis, Abu Dhabi, U Arab Emirates
Keywords
Transformers; ViT; sentiment; online social media; transfer learning; threshold moving; tweets; images; feature extraction; multimodality; fusion; big data; deep learning; FACIAL EXPRESSION RECOGNITION; EMOTION RECOGNITION; CLASSIFICATION;
DOI
10.1109/ACCESS.2023.3274744
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
We present an image-based real-time sentiment analysis system for recognizing in-the-wild sentiment expressions on online social networks (OSN). The system applies the recently proposed transformer architecture to OSN big data to extract emotion and sentiment features from three types of images: images containing faces, images containing text, and images containing neither faces nor text. We build a separate model for each image type and then fuse the three models to learn online sentiment behavior. Our methodology combines a supervised two-stage training approach with a threshold-moving method, which is crucial given the class imbalance found in OSN data. Training is carried out on existing popular datasets (one per model) and on our newly proposed dataset, the Domain-Free Multimedia Sentiment Dataset (DFMSD). Our results show that applying the threshold-moving method during training improved sentiment learning performance by 5-8 percentage points compared to training without it. Combining the two-stage strategy with the threshold-moving method during training proved effective in further improving learning performance (approximately 12% higher accuracy than the threshold-moving strategy alone). Furthermore, the proposed approach had a positive impact on the fusion of the three models in terms of accuracy and F-score.
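The paper itself includes no code; purely as an illustration of the threshold-moving idea the abstract refers to, the sketch below tunes a binary decision threshold on a held-out validation set instead of using the default 0.5 cutoff. The class labels, the F1 selection criterion, and the simulated data are assumptions for the example, not details taken from the paper.

```python
import numpy as np

def best_threshold(probs, labels, grid=np.linspace(0.05, 0.95, 19)):
    """Pick the decision threshold that maximizes F1 on a validation set.

    Under class imbalance the default 0.5 cutoff is often suboptimal;
    threshold moving tunes the cutoff after training instead of
    retraining or resampling.
    """
    best_t, best_f1 = 0.5, -1.0
    for t in grid:
        preds = (probs >= t).astype(int)
        tp = np.sum((preds == 1) & (labels == 1))
        fp = np.sum((preds == 1) & (labels == 0))
        fn = np.sum((preds == 0) & (labels == 1))
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

# Toy imbalanced validation set: 1 = positive sentiment (minority class).
rng = np.random.default_rng(0)
labels = (rng.random(1000) < 0.2).astype(int)  # ~20% positives
# Simulated classifier scores: positives score higher on average.
probs = np.clip(0.35 * labels + rng.normal(0.3, 0.15, 1000), 0.0, 1.0)

t, f1 = best_threshold(probs, labels)
print(f"moved threshold: {t:.2f}  validation F1: {f1:.3f}")
```

In practice the threshold found on validation data is then frozen and applied to test-time predictions; the paper's contribution is inducing this step during a two-stage training process rather than the specific search shown here.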
Pages: 47070-47079
Page count: 10