Improving biomedical entity linking for complex entity mentions with LLM-based text simplification

Authors
Borchert, Florian [1]
Llorca, Ignacio [1]
Schapranow, Matthieu-P [1]
Affiliations
[1] Univ Potsdam, Hasso Plattner Inst Digital Engn, Prof Dr Helmert Str 2-3, D-14482 Potsdam, Germany
Keywords
SYSTEM;
DOI
10.1093/database/baae067
Chinese Library Classification
Q [Biological Sciences];
Subject classification codes
07; 0710; 09;
Abstract
Large amounts of important medical information are captured in free-text documents in biomedical research and within healthcare systems, which can be made accessible through natural language processing (NLP). A key component in most biomedical NLP pipelines is entity linking, i.e. grounding textual mentions of named entities to a reference of medical concepts, usually derived from a terminology system, such as the Systematized Nomenclature of Medicine Clinical Terms. However, complex entity mentions, spanning multiple tokens, are notoriously hard to normalize due to the difficulty of finding appropriate candidate concepts. In this work, we propose an approach to preprocess such mentions for candidate generation, building upon recent advances in text simplification with generative large language models. We evaluate the feasibility of our method in the context of the entity linking track of the BioCreative VIII SympTEMIST shared task. We find that instructing the latest Generative Pre-trained Transformer model with a few-shot prompt for text simplification results in mention spans that are easier to normalize. Thus, we can improve recall during candidate generation by 2.9 percentage points compared to our baseline system, which achieved the best score in the original shared task evaluation. Furthermore, we show that this improvement in recall can be fully translated into top-1 accuracy through careful initialization of a subsequent reranking model. Our best system achieves an accuracy of 63.6% on the SympTEMIST test set. The proposed approach has been integrated into the open-source xMEN toolkit, which is available online via https://github.com/hpi-dhc/xmen.
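The idea described in the abstract, simplifying a complex mention before candidate generation and measuring the effect on recall, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not the authors' implementation or the xMEN API: the toy terminology and its concept IDs are invented, the candidate generator is a simple token-overlap ranker, and `simplify_stub` is a hypothetical lookup table standing in for the few-shot LLM call.

```python
# Toy terminology: concept IDs and names are invented for this sketch.
TERMINOLOGY = {
    "T1": "abdominal pain",
    "T2": "chest pain",
    "T3": "fever",
    "T4": "lower back pain",
}

MENTIONS = ["pain in the lower right part of the belly", "fever"]
GOLD = ["T1", "T3"]


def tokens(text):
    """Lowercased whitespace tokens as a set."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard overlap of two token sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def generate_candidates(mention, terminology, k=1):
    """Rank concept IDs by token overlap with the mention, keep the top k."""
    query = tokens(mention)
    ranked = sorted(
        terminology,
        key=lambda cid: jaccard(query, tokens(terminology[cid])),
        reverse=True,
    )
    return ranked[:k]


def simplify_stub(mention):
    """Hypothetical stand-in for an LLM simplification step (assumption)."""
    lookup = {"pain in the lower right part of the belly": "abdominal pain"}
    return lookup.get(mention, mention)


def recall_at_k(mentions, gold_ids, terminology, k=1, simplify=None):
    """Share of mentions whose gold concept is among the top-k candidates."""
    hits = 0
    for mention, gold_id in zip(mentions, gold_ids):
        query = simplify(mention) if simplify else mention
        hits += gold_id in generate_candidates(query, terminology, k)
    return hits / len(mentions)


baseline = recall_at_k(MENTIONS, GOLD, TERMINOLOGY, k=1)
simplified = recall_at_k(MENTIONS, GOLD, TERMINOLOGY, k=1, simplify=simplify_stub)
print(baseline, simplified)  # 0.5 1.0
```

In this toy setting the long mention shares more tokens with "lower back pain" than with its gold concept, so the baseline misses it at k=1, while the simplified span matches the gold concept name exactly. The numbers are only a property of the invented data and say nothing about the reported results.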
Pages: 9