Transferring From Textual Entailment to Biomedical Named Entity Recognition

Cited by: 1
Authors
Liang, Tingting [1]
Xia, Congying [2]
Zhao, Ziqiang [1]
Jiang, Yixuan [1]
Yin, Yuyu [1]
Yu, Philip S. [3]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Comp Sci & Technol, Hangzhou 310018, Peoples R China
[2] Salesforce Res, Palo Alto, CA 94301 USA
[3] Univ Illinois, Dept Comp Sci, Chicago, IL 60607 USA
Funding
National Natural Science Foundation of China
Keywords
Biomedical named entity recognition; contrastive learning; textual entailment; transfer learning
DOI
10.1109/TCBB.2023.3236477
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Biomedical Named Entity Recognition (BioNER) aims to identify biomedical entities such as genes, proteins, diseases, and chemical compounds in textual data. However, because of ethical and privacy constraints and the high specialization of biomedical data, BioNER suffers from a more severe shortage of high-quality labeled data than the general domain, especially at the token level. Given this extremely limited labeled data, this work studies gazetteer-based BioNER, which aims to build a BioNER system from scratch: the system must identify the entities in given sentences with zero token-level annotations available for training. Previous works usually solve NER or BioNER with sequence labeling models and, when full annotations are unavailable, obtain weakly labeled data from gazetteers. However, these weak labels are quite noisy, because a label is needed for every token and the entity coverage of gazetteers is limited. Here we propose to formulate BioNER as a textual entailment problem and solve it via Textual Entailment with Dynamic Contrastive learning (TEDC). TEDC not only alleviates the noisy labeling issue but also transfers knowledge from pre-trained textual entailment models. Additionally, the dynamic contrastive learning framework contrasts the entities and non-entities in the same sentence, improving the model's discrimination ability. Experiments on two real-world biomedical datasets show that TEDC achieves state-of-the-art performance for gazetteer-based BioNER.
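The reformulation described in the abstract, from per-token labels to sentence-level entailment decisions, can be illustrated with a minimal sketch. This is not the authors' TEDC implementation: the function name `build_pairs`, the hypothesis template "X is a <type>.", the span-length limit `max_len`, and the toy gazetteer are all assumptions made for illustration. The sketch enumerates candidate spans in a sentence and pairs the sentence (premise) with a templated statement about each span (hypothesis), using gazetteer matches as weak positive labels.

```python
from typing import List, NamedTuple, Set, Tuple


class EntailmentPair(NamedTuple):
    premise: str            # the original sentence
    hypothesis: str         # templated statement about one candidate span
    span: Tuple[int, int]   # token offsets, start inclusive, end exclusive
    label: int              # 1 = span found in gazetteer, 0 = not found


def build_pairs(tokens: List[str], gazetteer: Set[str],
                entity_type: str, max_len: int = 3) -> List[EntailmentPair]:
    """Hypothetical helper: turn one sentence into premise/hypothesis pairs."""
    premise = " ".join(tokens)
    lowered = {entry.lower() for entry in gazetteer}
    pairs = []
    for start in range(len(tokens)):
        # Enumerate every span of 1..max_len tokens starting at `start`.
        for end in range(start + 1, min(start + max_len, len(tokens)) + 1):
            text = " ".join(tokens[start:end])
            # Weak supervision: a gazetteer hit is treated as a positive.
            label = 1 if text.lower() in lowered else 0
            # The template is an assumption, not taken from the paper.
            hypothesis = f"{text} is a {entity_type}."
            pairs.append(EntailmentPair(premise, hypothesis, (start, end), label))
    return pairs


tokens = ["Mutations", "in", "BRCA1", "cause", "breast", "cancer"]
pairs = build_pairs(tokens, {"breast cancer"}, "disease", max_len=2)
positives = [p for p in pairs if p.label == 1]
```

Spans labeled 0 here are only unmatched, not confirmed non-entities, since gazetteer coverage is limited. In a setup like the one the abstract describes, such pairs would be scored by a pre-trained textual entailment model, and the matched and unmatched spans from the same sentence could serve as the contrasted examples.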
Pages: 2577-2586
Page count: 10