GMNER-LF: Generative Multi-modal Named Entity Recognition Based on LLM with Information Fusion

Cited by: 0

Authors:

Hu, Huiyun [1,2]
Kong, Junda [1,2]
Wang, Fei [2]
Sun, Hongzhi [2]
Ge, Yang [2]
Xiao, Bo [1]

Affiliations:

[1] Dezhou Power Supply Co, State Grid Shandong Elect Power Co, Dezhou, Peoples R China

[2] Beijing Univ Posts & Telecommun, Beijing, Peoples R China

Source:

2024 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC | 2024

Keywords:

DOI:

10.1109/APSIPAASC63619.2025.10848846

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Multi-modal Named Entity Recognition (MNER) uses visual information to improve on text-only Named Entity Recognition (NER). Many current methods are based on sequence labeling, but their model architectures are relatively complex. Because Large Language Models (LLMs) have recently demonstrated powerful generative and comprehension abilities, we propose GMNER-LF, a new generative paradigm for MNER. First, we retrieve relevant image-text pairs to provide prior knowledge for recognition. Second, we reformulate the task as a machine reading comprehension (MRC) task so that the LLM can better understand the problem. In addition, we design a multi-modal fusion module with a gating mechanism that filters noisy information in the image to obtain high-quality fused representations. The multi-modal fusion module is injected into the LLM block to achieve deep fusion of the LLM and the multi-modal representations and to fully exploit the LLM's internal knowledge. Experimental results show that the proposed method achieves better performance than other generative methods on both the Twitter-2015 and Twitter-2017 datasets.
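
The abstract does not give the form of the gating mechanism, so the following is only a minimal sketch of one common gated-fusion pattern, not the authors' implementation. It assumes a single text vector and a single image vector of equal size, an element-wise gate g = sigmoid(W[t; v] + b), and a residual-style fusion t + g * v. The names `gated_fusion`, `gate_weights`, and `gate_bias` are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def gated_fusion(text_vec, image_vec, gate_weights, gate_bias):
    """Fuse a text vector with an image vector through an element-wise gate.

    Assumed (hypothetical) formulation, not taken from the paper:
        g     = sigmoid(W [t; v] + b)
        fused = t + g * v
    gate_weights holds one row per output dimension; each row has
    len(text_vec) + len(image_vec) entries.
    """
    if len(text_vec) != len(image_vec):
        raise ValueError("text and image vectors must have the same size")
    concat = list(text_vec) + list(image_vec)
    fused = []
    for i, (t, v) in enumerate(zip(text_vec, image_vec)):
        score = sum(w * x for w, x in zip(gate_weights[i], concat)) + gate_bias[i]
        g = sigmoid(score)  # close to 0 suppresses the image feature
        fused.append(t + g * v)
    return fused


text = [1.0, 2.0]
image = [4.0, -2.0]
zero_weights = [[0.0] * 4 for _ in range(2)]

# Zero weights and zero bias give g = 0.5 in every dimension.
half_open = gated_fusion(text, image, zero_weights, [0.0, 0.0])

# A strongly negative bias closes the gate, so the output stays near the text vector.
closed = gated_fusion(text, image, zero_weights, [-50.0, -50.0])
```

In this sketch a closed gate leaves the text representation essentially unchanged, which is the sense in which a gate can filter noisy image information. In a trained model the gate parameters would be learned rather than set by hand.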


Pages: 6
