Combining Knowledge and Multi-modal Fusion for Meme Classification

Cited by: 4
Authors
Zhong, Qi [1 ]
Wang, Qian [1 ]
Liu, Ji [1 ]
Affiliations
[1] Chongqing Univ, Coll Comp Sci, Chongqing 400044, Peoples R China
Keywords
Meme classification; Multi-modal fusion; Self-attention mechanism
DOI
10.1007/978-3-030-98358-1_47
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Internet memes are widespread on social media platforms such as Twitter and Facebook. Meme classification has recently become an active research topic, particularly meme sentiment classification and offensive meme classification. Memes carry multi-modal information, with the meme text embedded in the meme image. Existing methods classify memes by simply concatenating global visual and textual features into a multi-modal representation. However, these approaches ignore both the noise introduced by global visual features and the common information latent in the multi-modal representation of a meme. In this paper, we propose MeBERT, a model for meme classification. Our method enriches the semantic representation of a meme by introducing conceptual information from external Knowledge Bases (KBs). To reduce noise, a concept-image attention module is designed to extract a concept-sensitive visual representation. In addition, a deep convolution tensor fusion module is built to effectively integrate the multi-modal information. To verify the effectiveness of the model on meme sentiment classification and offensive meme classification, we conduct experiments on the Memotion and MultiOFF datasets. The results show that MeBERT outperforms state-of-the-art meme classification techniques.
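
The abstract walks through three architectural pieces: concept enrichment from external KBs, a concept-image attention module that extracts a concept-sensitive visual representation, and a deep convolution tensor fusion module. As a reading aid, below is a minimal PyTorch sketch of the latter two components. It is an assumption-laden illustration, not the authors' implementation: the 768-d feature size, the use of multi-head cross-attention with concept embeddings as queries, the outer-product interaction tensor, and the class names (ConceptImageAttention, ConvTensorFusion) are all hypothetical.

import torch
import torch.nn as nn

class ConceptImageAttention(nn.Module):
    # Hypothetical sketch: KB concept embeddings act as queries over the
    # visual features, so image content unrelated to any concept is
    # down-weighted -- one way to realize the noise reduction the
    # abstract describes.
    def __init__(self, dim=768, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, concepts, regions):
        # concepts: (B, Nc, dim) concept embeddings from external KBs
        # regions:  (B, Nr, dim) visual features of the meme image
        attended, _ = self.attn(concepts, regions, regions)
        return attended.mean(dim=1)  # (B, dim) concept-sensitive visual vector

class ConvTensorFusion(nn.Module):
    # One plausible reading of "deep convolution tensor fusion": build an
    # outer-product interaction tensor of the two modalities, then
    # compress it with a 1-D convolution before classification.
    def __init__(self, dim=768, out_dim=256, num_classes=3):
        super().__init__()
        self.conv = nn.Conv1d(dim, out_dim, kernel_size=3, padding=1)
        self.cls = nn.Linear(out_dim, num_classes)

    def forward(self, text_vec, vis_vec):
        # text_vec, vis_vec: (B, dim) pooled textual / visual vectors
        inter = torch.einsum('bi,bj->bij', text_vec, vis_vec)  # (B, dim, dim)
        feat = self.conv(inter).mean(dim=-1)                   # (B, out_dim)
        return self.cls(feat)                                  # (B, num_classes)

# Hypothetical usage with random tensors standing in for real features.
batch = 2
vis_vec = ConceptImageAttention()(torch.randn(batch, 5, 768),
                                  torch.randn(batch, 49, 768))
logits = ConvTensorFusion()(torch.randn(batch, 768), vis_vec)

The design intuition this matches in the abstract: cross-attention filters the global visual features through the KB concepts, and the outer product exposes pairwise text-image interactions that simple concatenation would miss; the actual MeBERT modules may differ in every detail above.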
Pages: 599-611
Number of pages: 13