Cross-Domain Data Integration for Named Entity Disambiguation in Biomedical Text

Authors
Varma, Maya [1]
Orr, Laurel [1]
Wu, Sen [1]
Leszczynski, Megan [1]
Ling, Xiao [2]
Re, Christopher [1]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] Apple, Cupertino, CA USA
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Named entity disambiguation (NED), which involves mapping textual mentions to structured entities, is particularly challenging in the medical domain due to the presence of rare entities. Existing approaches are limited by the presence of coarse-grained structural resources in biomedical knowledge bases as well as the use of training datasets that provide low coverage over uncommon resources. In this work, we address these issues by proposing a cross-domain data integration method that transfers structural knowledge from a general text knowledge base to the medical domain. We utilize our integration scheme to augment structural resources and generate a large biomedical NED dataset for pretraining. Our pretrained model with injected structural knowledge achieves state-of-the-art performance on two benchmark medical NED datasets: MedMentions and BC5CDR. Furthermore, we improve disambiguation of rare entities by up to 57 accuracy points.
Pages: 4566-4575
Page count: 10