GNN-Based Multimodal Named Entity Recognition

Cited by: 2
Authors
Gong, Yunchao [1, 2, 3]
Lv, Xueqiang [1, 2]
Yuan, Zhu [1, 2]
You, Xindong [2]
Hu, Feng [1, 3]
Chen, Yuzhong [1, 3]
Affiliations
[1] Qinghai Normal Univ, Coll Comp, 38 Wusi West Rd, Xining 810008, Qinghai, Peoples R China
[2] Beijing Informat Sci & Technol Univ, Beijing Key Lab Internet Culture & Digital Dissemi, 35 Beisihuanzhong Rd, Beijing 100101, Peoples R China
[3] Qinghai Normal Univ, State Key Lab Tibetan Intelligent Informat Proc &, 38 Wusi West Rd, Xining 810008, Qinghai, Peoples R China
Source
COMPUTER JOURNAL | 2024 / Vol. 67 / No. 08
Funding
National Natural Science Foundation of China
Keywords
multimodality; named entity recognition; multimodal interaction graph; graph neural network; FUSION;
DOI
10.1093/comjnl/bxae030
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The Multimodal Named Entity Recognition (MNER) task enhances the text representations and improves the accuracy and robustness of named entity recognition by leveraging visual information from images. However, previous methods have two limitations: (i) the semantic mismatch between text and image modalities makes it challenging to establish accurate internal connections between words and visual representations. Besides, the limited number of characters in social media posts leads to semantic and contextual ambiguity, further exacerbating the semantic mismatch between modalities. (ii) Existing methods employ cross-modal attention mechanisms to facilitate interaction and fusion between different modalities, overlooking fine-grained correspondences between semantic units of text and images. To alleviate these issues, we propose a graph neural network approach for MNER (GNN-MNER), which promotes fine-grained alignment and interaction between semantic units of different modalities. Specifically, to mitigate the issue of semantic mismatch between modalities, we construct corresponding graph structures for text and images, and leverage graph convolutional networks to augment text and visual representations. For the second issue, we propose a multimodal interaction graph to explicitly represent the fine-grained semantic correspondences between text and visual objects. Based on this graph, we implement deep-level feature fusion between modalities utilizing graph attention networks. Compared with existing methods, our approach is the first to extend graph deep learning throughout the MNER task. Extensive experiments on the Twitter multimodal datasets validate the effectiveness of our GNN-MNER.
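As a rough illustration of the graph-convolution step the abstract mentions, the following is a minimal sketch assuming the standard GCN propagation rule, ReLU(D^-1/2 (A + I) D^-1/2 H W). The abstract does not state the exact formulation used in GNN-MNER, so the function name `gcn_layer`, the toy three-node graph, the features, and the weights below are all hypothetical and chosen only for illustration.

```python
import math


def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 @ H @ W).

    Assumption: this is the generic GCN rule, not the paper's own code.
    adj    -- n x n adjacency matrix (list of lists, 0/1 entries)
    feats  -- n x d node feature matrix H
    weight -- d x o projection matrix W
    """
    n = len(adj)
    d_in = len(feats[0])
    d_out = len(weight[0])

    # Add self-loops so every node keeps part of its own representation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]

    # Node degrees in the self-looped graph.
    deg = [sum(row) for row in a_hat]

    # Symmetric normalization: entry (i, j) is divided by sqrt(deg_i * deg_j).
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

    # Aggregate neighbor features (norm @ feats).
    agg = [[sum(norm[i][k] * feats[k][c] for k in range(n))
            for c in range(d_in)] for i in range(n)]

    # Linear projection followed by ReLU.
    return [[max(0.0, sum(agg[i][c] * weight[c][o] for c in range(d_in)))
             for o in range(d_out)] for i in range(n)]


# Hypothetical toy graph: a chain of three nodes (0 - 1 - 2).
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
feats = [[1.0, 0.0],
         [0.0, 1.0],
         [1.0, 1.0]]
weight = [[1.0, 0.0],
          [0.0, 1.0]]  # identity projection, so the output is the aggregation

out = gcn_layer(adj, feats, weight)
```

With an identity projection, each output row is simply the degree-normalized mix of a node's own features and those of its neighbors, which is the sense in which graph convolution "augments" a node representation with its graph context.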
Pages: 2622-2632
Page count: 11