MGICL: Multi-Grained Interaction Contrastive Learning for Multimodal Named Entity Recognition

Cited by: 3
Authors
Guo, Aibo [1]
Zhao, Xiang [1]
Tan, Zhen [1]
Xiao, Weidong [1]
Affiliations
[1] National University of Defense Technology, Changsha, Hunan, People's Republic of China
Keywords
Multimodal named entity recognition; Multimodal representation; Contrastive learning; Multi-Grained interaction contrastive learning; Visual gate;
DOI
10.1145/3583780.3614967
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Multimodal Named Entity Recognition (MNER) aims to combine data from different modalities (e.g., text, images, and videos) to recognize and classify named entities, which is crucial for constructing Multimodal Knowledge Graphs (MMKGs). However, existing research suffers from two prominent issues: over-reliance on textual features at the expense of visual features, and the lack of an effective way to reduce the feature space discrepancy between modalities. To overcome these challenges, this paper proposes a Multi-Grained Interaction Contrastive Learning framework for the MNER task, namely MGICL. MGICL slices data into different granularities, i.e., sentence level and word token level for text, and image level and object level for images. By contrasting multimodal features across these granularities, the framework narrows the feature space discrepancy between modalities and helps the text acquire valuable visual features. Additionally, a visual gate control mechanism is introduced to dynamically select relevant visual information, thereby reducing the impact of visual noise. Experimental results demonstrate that the proposed MGICL framework addresses the challenges of MNER by enhancing information interaction across multimodal data and reducing the effect of noise, and hence effectively improves MNER performance.
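Illustrative sketch (not the authors' implementation): the abstract names two mechanisms, cross-modal contrastive learning and a visual gate, but gives no formulas. The Python below shows one common way such components are built, assuming a symmetric InfoNCE loss for the contrast and a scalar sigmoid gate for the visual filter. The function names `info_nce` and `visual_gate`, the `temperature` value, and the gate parameters `weights` and `bias` are all hypothetical choices made for this example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    # L2-normalize so that dot products are cosine similarities.
    norm = math.sqrt(dot(v, v)) or 1.0
    return [a / norm for a in v]

def info_nce(text_feats, image_feats, temperature=0.1):
    """Symmetric InfoNCE loss; row k of each input is assumed to be a matched pair."""
    t = [normalize(v) for v in text_feats]
    i = [normalize(v) for v in image_feats]
    n = len(t)
    sim = [[dot(t[a], i[b]) / temperature for b in range(n)] for a in range(n)]

    def directional_loss(rows):
        # Cross-entropy with the diagonal entry as the positive for each row.
        total = 0.0
        for k, row in enumerate(rows):
            m = max(row)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[k]
        return total / n

    cols = [list(c) for c in zip(*sim)]  # image-to-text direction
    return 0.5 * (directional_loss(sim) + directional_loss(cols))

def visual_gate(text_vec, visual_vec, weights, bias=0.0):
    """Scale visual features by a sigmoid gate computed from text and visual inputs."""
    z = dot(weights, text_vec + visual_vec) + bias  # list concatenation
    g = 1.0 / (1.0 + math.exp(-z))
    return g, [g * x for x in visual_vec]
```

Under these assumptions, the same `info_nce` call could be applied to each pairing of granularities (for example sentence with image, or word token with object), and the gate value `g` near 0 would suppress a visual vector judged irrelevant to the text.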
Pages: 639 - 648
Number of pages: 10
Related papers
50 in total
  • [1] Multi-Grained Named Entity Recognition
    Xia, Congying
    Zhang, Chenwei
    Yang, Tao
    Li, Yaliang
    Du, Nan
    Wu, Xian
    Fan, Wei
    Ma, Fenglong
    Yu, Philip
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 1430 - 1440
  • [2] Multi-Grained Knowledge Distillation for Named Entity Recognition
    Zhou, Xuan
    Zhang, Xiao
    Tao, Chenyang
    Chen, Junya
    Xu, Bing
    Wang, Wei
    Xiao, Jing
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 5704 - 5716
  • [3] Multi-Grained Multimodal Interaction Network for Entity Linking
    Luo, Pengfei
    Xu, Tong
    Wu, Shiwei
    Zhu, Chen
    Xu, Linli
    Chen, Enhong
    PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023, 2023, : 1583 - 1594
  • [4] Multimodal Named Entity Recognition with Bottleneck Fusion and Contrastive Learning
    Wang, Peng
    Chen, Xiaohang
    Shang, Ziyu
    Ke, Wenjun
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2023, E106D (04) : 545 - 555
  • [5] AERNs: Attention-Based Entity Region Networks for Multi-Grained Named Entity Recognition
    Dai, Jianghai
    Feng, Chong
    Bai, Xuefeng
    Dai, Jinming
    Zhang, Huanhuan
    2019 IEEE 31ST INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2019), 2019, : 408 - 415
  • [6] Contrastive Pre-training with Multi-level Alignment for Grounded Multimodal Named Entity Recognition
    Bao, Xigang
    Tian, Mengyuan
    Wang, Luyao
    Zha, Zhiyuan
    Qin, Biao
    PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024, 2024, : 795 - 803
  • [7] Chinese Named Entity Recognition Based on Template and Contrastive Learning
    Zhu, Jingjing
    Cai, Tianyu
    Zhao, Zhenyu
    Ju, Shenggen
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT I, NLPCC 2024, 2025, 15359 : 392 - 405
  • [8] Fine-Grained Multimodal Named Entity Recognition and Grounding with a Generative Framework
    Wang, Jieming
    Li, Ziyan
    Yu, Jianfei
    Yang, Li
    Xia, Rui
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 3934 - 3943
  • [9] Dual Contrastive Learning for Cross-Domain Named Entity Recognition
    Xu, Jingyun
    Yu, Junnan
    Cai, Yi
    Chua, Tat-Seng
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2024, 42 (06)
  • [10] A Survey on Multimodal Named Entity Recognition
    Qian, Shenyi
    Jin, Wenduo
    Chen, Yonggang
    Ma, Jiangtao
    Qiao, Yaqiong
    Lu, Jinyu
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, ICIC 2023, PT IV, 2023, 14089 : 609 - 622