Transferring From Textual Entailment to Biomedical Named Entity Recognition

Cited by: 1
Authors
Liang, Tingting [1]
Xia, Congying [2]
Zhao, Ziqiang [1]
Jiang, Yixuan [1]
Yin, Yuyu [1]
Yu, Philip S. [3]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Comp Sci & Technol, Hangzhou 310018, Peoples R China
[2] Salesforce Res, Palo Alto, CA 94301 USA
[3] Univ Illinois, Dept Comp Sci, Chicago, IL 60607 USA
Funding
National Natural Science Foundation of China
Keywords
Biomedical named entity recognition; contrastive learning; textual entailment; transfer learning
DOI
10.1109/TCBB.2023.3236477
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Biomedical Named Entity Recognition (BioNER) aims to identify biomedical entities such as genes, proteins, diseases, and chemical compounds in text. However, because of ethical and privacy constraints and the highly specialized nature of biomedical data, BioNER suffers more severely than the general domain from a lack of high-quality labeled data, especially at the token level. Given this extremely limited supply of labeled biomedical data, this work studies gazetteer-based BioNER, which aims to build a BioNER system from scratch: the system must identify the entities in given sentences with zero token-level annotations available for training. Previous works usually solve NER or BioNER with sequence labeling models and, when full annotations are unavailable, obtain weakly labeled data from gazetteers. However, these weak labels are quite noisy, because a label is needed for every token and the entity coverage of gazetteers is limited. We propose to formulate BioNER as a textual entailment problem and to solve it via Textual Entailment with Dynamic Contrastive learning (TEDC). TEDC not only alleviates the noisy labeling issue but also transfers knowledge from pre-trained textual entailment models. Additionally, the dynamic contrastive learning framework contrasts the entities and non-entities in the same sentence, improving the model's discrimination ability. Experiments on two real-world biomedical datasets show that TEDC can achieve state-of-the-art performance for gazetteer-based BioNER.
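As a rough illustration of the entailment formulation described in the abstract, the sketch below turns one sentence into premise/hypothesis pairs, one per candidate span, and marks a span as a positive only when it appears in a gazetteer. It is not the TEDC implementation: the abstract does not give the hypothesis template, the span-length limit, or how unmatched spans are treated, so the function names, the "X is a TYPE." wording, `max_len`, and the `"unlabeled"` tag are all assumptions made for this example.

```python
from typing import List, Set, Tuple


def candidate_spans(tokens: List[str], max_len: int = 3) -> List[Tuple[int, int]]:
    """Enumerate every contiguous token span of 1..max_len tokens as (start, end)."""
    n = len(tokens)
    return [
        (start, end)
        for start in range(n)
        for end in range(start + 1, min(start + max_len, n) + 1)
    ]


def entailment_pairs(
    tokens: List[str],
    gazetteer: Set[str],
    entity_type: str,
    max_len: int = 3,
) -> List[Tuple[str, str, str]]:
    """Build (premise, hypothesis, label) triples for one sentence.

    The hypothesis template is an assumption, not taken from the paper.
    Spans missing from the gazetteer are tagged "unlabeled" rather than
    negative, because gazetteer coverage is limited.
    """
    premise = " ".join(tokens)
    pairs = []
    for start, end in candidate_spans(tokens, max_len):
        span = " ".join(tokens[start:end])
        hypothesis = f"{span} is a {entity_type}."
        label = "entailment" if span.lower() in gazetteer else "unlabeled"
        pairs.append((premise, hypothesis, label))
    return pairs
```

Each pair could then be scored by a pre-trained textual entailment model, so supervision attaches to whole spans instead of to every token. The gazetteer entries are assumed to be lowercased.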
Pages: 2577-2586
Page count: 10