Combining Knowledge and Multi-modal Fusion for Meme Classification

Cited by: 4
Authors
Zhong, Qi [1 ]
Wang, Qian [1 ]
Liu, Ji [1 ]
Affiliations
[1] Chongqing Univ, Coll Comp Sci, Chongqing 400044, Peoples R China
Keywords
Meme classification; Multi-modal fusion; Self-attention mechanism
DOI
10.1007/978-3-030-98358-1_47
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Internet memes are widespread on social media platforms such as Twitter and Facebook. Meme classification has recently become an active research topic, particularly meme sentiment classification and offensive meme classification. Memes carry multi-modal information, with the meme text embedded in the meme image. Existing methods classify memes by simply concatenating global visual and textual features into a multi-modal representation. However, these approaches ignore both the noise introduced by global visual features and the common information latent in the multi-modal representation of a meme. In this paper, we propose MeBERT, a model for meme classification. Our method enriches the semantic representation of a meme by introducing conceptual information from external Knowledge Bases (KBs). To reduce noise, a concept-image attention module is designed to extract a concept-sensitive visual representation. In addition, a deep convolution tensor fusion module is built to effectively integrate the multi-modal information. To verify the effectiveness of the model on meme sentiment classification and offensive meme classification, we conduct experiments on the Memotion and MultiOFF datasets. The results show that MeBERT outperforms state-of-the-art meme classification techniques.
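
The abstract walks through three architectural pieces: concept enrichment from external KBs, a concept-image attention module that extracts a concept-sensitive visual representation, and a deep convolution tensor fusion module. As a reading aid, below is a minimal PyTorch sketch of the latter two components. It is an assumption-laden illustration, not the authors' implementation: the 768-d feature size, the use of multi-head cross-attention with concept embeddings as queries, the outer-product interaction tensor, and the class names (ConceptImageAttention, ConvTensorFusion) are all hypothetical.

import torch
import torch.nn as nn

class ConceptImageAttention(nn.Module):
    # Hypothetical sketch: KB concept embeddings act as queries over the
    # visual features, so image content unrelated to any concept is
    # down-weighted -- one way to realize the noise reduction the
    # abstract describes.
    def __init__(self, dim=768, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, concepts, regions):
        # concepts: (B, Nc, dim) concept embeddings from external KBs
        # regions:  (B, Nr, dim) visual features of the meme image
        attended, _ = self.attn(concepts, regions, regions)
        return attended.mean(dim=1)  # (B, dim) concept-sensitive visual vector

class ConvTensorFusion(nn.Module):
    # One plausible reading of "deep convolution tensor fusion": build an
    # outer-product interaction tensor of the two modalities, then
    # compress it with a 1-D convolution before classification.
    def __init__(self, dim=768, out_dim=256, num_classes=3):
        super().__init__()
        self.conv = nn.Conv1d(dim, out_dim, kernel_size=3, padding=1)
        self.cls = nn.Linear(out_dim, num_classes)

    def forward(self, text_vec, vis_vec):
        # text_vec, vis_vec: (B, dim) pooled textual / visual vectors
        inter = torch.einsum('bi,bj->bij', text_vec, vis_vec)  # (B, dim, dim)
        feat = self.conv(inter).mean(dim=-1)                   # (B, out_dim)
        return self.cls(feat)                                  # (B, num_classes)

# Hypothetical usage with random tensors standing in for real features.
batch = 2
vis_vec = ConceptImageAttention()(torch.randn(batch, 5, 768),
                                  torch.randn(batch, 49, 768))
logits = ConvTensorFusion()(torch.randn(batch, 768), vis_vec)

The design intuition this matches in the abstract: cross-attention filters the global visual features through the KB concepts, and the outer product exposes pairwise text-image interactions that simple concatenation would miss; the actual MeBERT modules may differ in every detail above.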
Pages: 599-611
Number of pages: 13