GNN-Based Multimodal Named Entity Recognition

Cited by: 2
Authors
Gong, Yunchao [1 ,2 ,3 ]
Lv, Xueqiang [1 ,2 ]
Yuan, Zhu [1 ,2 ]
You, Xindong [2 ]
Hu, Feng [1 ,3 ]
Chen, Yuzhong [1 ,3 ]
Affiliations
[1] Qinghai Normal Univ, Coll Comp, 38 Wusi West Rd, Xining 810008, Qinghai, Peoples R China
[2] Beijing Informat Sci & Technol Univ, Beijing Key Lab Internet Culture & Digital Dissemi, 35 Beisihuanzhong Rd, Beijing 100101, Peoples R China
[3] Qinghai Normal Univ, State Key Lab Tibetan Intelligent Informat Proc &, 38 Wusi West Rd, Xining 810008, Qinghai, Peoples R China
Source
COMPUTER JOURNAL | 2024 / Vol. 67 / Issue 08
Funding
National Natural Science Foundation of China;
Keywords
multimodality; named entity recognition; multimodal interaction graph; graph neural network; FUSION;
DOI
10.1093/comjnl/bxae030
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
The Multimodal Named Entity Recognition (MNER) task enhances the text representations and improves the accuracy and robustness of named entity recognition by leveraging visual information from images. However, previous methods have two limitations: (i) the semantic mismatch between text and image modalities makes it challenging to establish accurate internal connections between words and visual representations. Besides, the limited number of characters in social media posts leads to semantic and contextual ambiguity, further exacerbating the semantic mismatch between modalities. (ii) Existing methods employ cross-modal attention mechanisms to facilitate interaction and fusion between different modalities, overlooking fine-grained correspondences between semantic units of text and images. To alleviate these issues, we propose a graph neural network approach for MNER (GNN-MNER), which promotes fine-grained alignment and interaction between semantic units of different modalities. Specifically, to mitigate the issue of semantic mismatch between modalities, we construct corresponding graph structures for text and images, and leverage graph convolutional networks to augment text and visual representations. For the second issue, we propose a multimodal interaction graph to explicitly represent the fine-grained semantic correspondences between text and visual objects. Based on this graph, we implement deep-level feature fusion between modalities utilizing graph attention networks. Compared with existing methods, our approach is the first to extend graph deep learning throughout the MNER task. Extensive experiments on the Twitter multimodal datasets validate the effectiveness of our GNN-MNER.
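The abstract says graph convolutional networks are used to augment the text and visual representations, but this record gives no architectural details. As a rough illustration only, the following is a minimal sketch of one generic graph convolution step in the widely used symmetric-normalized form, ReLU(D^-1/2 (A + I) D^-1/2 H W). It is not the authors' implementation; the function names `gcn_layer` and `matmul`, the toy three-node graph, and all numeric values are assumptions made for this example.

```python
import math

def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def gcn_layer(adj, feats, weight):
    """One generic GCN step: ReLU(D^-1/2 (A + I) D^-1/2 H W).

    adj    : n x n adjacency matrix (hypothetical word or object graph)
    feats  : n x d node feature matrix H
    weight : d x k weight matrix W (learned in practice, fixed here)
    """
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Degrees are at least 1 because of the self-loops.
    deg = [sum(row) for row in a_hat]
    # Symmetric normalization of the adjacency matrix.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    out = matmul(matmul(norm, feats), weight)
    # ReLU non-linearity.
    return [[max(v, 0.0) for v in row] for row in out]

# Hypothetical toy graph: three nodes connected in a chain (0-1, 1-2).
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
feats = [[1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 1.0]]
weight = [[1.0, 1.0] for _ in range(3)]

out = gcn_layer(adj, feats, weight)
```

Each output row mixes a node's own features with those of its neighbors, which is the sense in which a graph structure can enrich per-word or per-object representations. How the paper builds its text graph, image graph, and multimodal interaction graph is not stated in this record.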
Pages: 2622-2632
Page count: 11
Related papers
50 in total
  • [21] Biomedical Named Entity Recognition Based on MCBERT
    Wang, Sai
    Yilahun, Hankiz
    Hamdulla, Askar
    2022 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP 2022), 2022, : 247 - 252
  • [22] Named entity recognition based on deep learning
    Ji Z.
    Kong D.
    Liu W.
    Dong W.
    Sang Y.
    Jisuanji Jicheng Zhizao Xitong/Computer Integrated Manufacturing Systems, CIMS, 2022, 28 (06): : 1603 - 1615
  • [23] PromptMNER: Prompt-Based Entity-Related Visual Clue Extraction and Integration for Multimodal Named Entity Recognition
    Wang, Xuwu
    Tian, Junfeng
    Gui, Min
    Li, Zhixu
    Ye, Jiabo
    Yan, Ming
    Xiao, Yanghua
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2022, PT III, 2022, : 297 - 305
  • [24] Named Entity Recognition based on a Graph Structure
    Munoz, David
    Perez, Fernando
    Pinto, David
    COMPUTACION Y SISTEMAS, 2020, 24 (02): : 553 - 563
  • [25] Iterative GNN-based Decoder for Question Generation
    Fei, Zichu
    Zhang, Qi
    Zhou, Yaqian
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 2573 - 2582
  • [26] Assisting Multimodal Named Entity Recognition by cross-modal auxiliary tasks
    Chen, Zhengjie
    Zhang, Yu
    Mi, Siya
    PATTERN RECOGNITION LETTERS, 2023, 175 : 52 - 58
  • [27] A GNN-based predictor for quantum architecture search
    He, Zhimin
    Zhang, Xuefen
    Chen, Chuangtao
    Huang, Zhiming
    Zhou, Yan
    Situ, Haozhen
    QUANTUM INFORMATION PROCESSING, 2023, 22 (02)
  • [28] Robust GNN-based Representation Learning for HLS
    Sohrabizadeh, Atefeh
    Bai, Yunsheng
    Sun, Yizhou
    Cong, Jason
    2023 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER AIDED DESIGN, ICCAD, 2023,
  • [29] A GNN-Based QSPR Model for Surfactant Properties
    Ham, Seokgyun
    Wang, Xin
    Zhang, Hongwei
    Lattimer, Brian
    Qiao, Rui
    COLLOIDS AND INTERFACES, 2024, 8 (06)
  • [30] A GNN-Based Variable Partition Framework for DCOPs
    Chen, Chun
    Ning, Li
    Zhou, Rong
    Zhang, Yong
    Zhou, Chan
    Feng, Shengzhong
    WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2022, 2022