Beyond visual cues: Emotion recognition in images with text-aware fusion

Cited by: 1
Authors
Sungur, Kerim Serdar [1]
Bakal, Gokhan [1]
Affiliations
[1] Abdullah Gul Univ, Dept Comp Engn, Erkilet Blvd Sumer Campus, TR-38080 Kayseri, Turkiye
Keywords
Sentiment analysis; Hybrid model; Image & text processing; Deep learning
DOI
10.1016/j.displa.2024.102958
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Sentiment analysis is a widely studied problem for understanding human emotions and potential outcomes. While it can be performed on textual data, working with visual data is also critically important for examining a person's current emotional state. This work investigates whether sentiment analysis predictions on images can be improved by integrating textual data as additional knowledge that reflects the contextual information of the images. To that end, two separate models were developed, one for image processing and one for text processing, each trained on a distinct dataset comprising the same five human emotions. The outputs of the individual models' last dense layers are then combined to construct the hybrid model, which is empowered by both visual and textual components. The primary focus is to evaluate the performance of this hybrid model, in which the textual knowledge is concatenated with the visual data. The hybrid model achieved nearly a 3% F1-score improvement over the plain image classification model, which uses a convolutional neural network architecture. This research underscores the value of fusing textual context with visual information to refine sentiment analysis predictions. The findings not only emphasize the potential of a multi-modal approach but also point to a promising avenue for future advances in emotion analysis and understanding.
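The fusion step described in the abstract (concatenating the last dense layer outputs of the image and text models, then classifying into five emotions) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature sizes, the random untrained weights, and the helper names `dense`, `softmax`, and `fuse_and_classify` are all assumptions made for illustration, and plain Python lists stand in for real network layers.

```python
import math
import random

NUM_CLASSES = 5  # the abstract states five emotions; their names are not given


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """Fully connected layer: one output per row of `weights`."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def fuse_and_classify(image_features, text_features, weights, biases):
    """Concatenate both branch outputs, then apply a dense + softmax head."""
    fused = list(image_features) + list(text_features)
    return softmax(dense(fused, weights, biases))


# Hypothetical stand-ins for the last dense layer output of each branch.
# The sizes 8 and 4 are arbitrary and not taken from the paper.
rng = random.Random(0)
image_features = [rng.uniform(-1, 1) for _ in range(8)]
text_features = [rng.uniform(-1, 1) for _ in range(4)]

# Untrained, randomly initialised fusion head (illustrates shapes only).
fused_size = len(image_features) + len(text_features)
weights = [
    [rng.uniform(-0.5, 0.5) for _ in range(fused_size)]
    for _ in range(NUM_CLASSES)
]
biases = [0.0] * NUM_CLASSES

probabilities = fuse_and_classify(image_features, text_features, weights, biases)
predicted_class = max(range(NUM_CLASSES), key=lambda i: probabilities[i])
```

In a trained system the fusion head's weights would be learned, typically with the two branches built in a deep learning framework; the sketch only shows how the concatenated vector feeds a five-way classifier.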
Pages: 8