Transferring From Textual Entailment to Biomedical Named Entity Recognition

Cited by: 1
Authors
Liang, Tingting [1]
Xia, Congying [2]
Zhao, Ziqiang [1]
Jiang, Yixuan [1]
Yin, Yuyu [1]
Yu, Philip S. [3]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Comp Sci & Technol, Hangzhou 310018, Peoples R China
[2] Salesforce Res, Palo Alto, CA 94301 USA
[3] Univ Illinois, Dept Comp Sci, Chicago, IL 60607 USA
Funding
National Natural Science Foundation of China
Keywords
Biomedical named entity recognition; contrastive learning; textual entailment; transfer learning
DOI
10.1109/TCBB.2023.3236477
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Biomedical Named Entity Recognition (BioNER) aims to identify biomedical entities such as genes, proteins, diseases, and chemical compounds in text. However, because of ethical and privacy concerns and the high specialization of biomedical data, BioNER suffers from a more severe shortage of high-quality labeled data than the general domain, especially at the token level. Given this extremely limited supply of labeled biomedical data, this work studies gazetteer-based BioNER, which aims to build a BioNER system from scratch: the system must identify the entities in given sentences with zero token-level annotations available for training. Previous works usually solve NER or BioNER with sequence labeling models and, when full annotations are unavailable, obtain weakly labeled data from gazetteers. However, these weak labels are quite noisy, because a label is needed for every token and the entity coverage of gazetteers is limited. We propose to formulate BioNER as a textual entailment problem and to solve it via Textual Entailment with Dynamic Contrastive learning (TEDC). TEDC not only alleviates the noisy labeling issue but also transfers knowledge from pre-trained textual entailment models. Additionally, the dynamic contrastive learning framework contrasts entities and non-entities in the same sentence, which improves the model's discrimination ability. Experiments on two real-world biomedical datasets show that TEDC achieves state-of-the-art performance for gazetteer-based BioNER.
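The entailment formulation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy gazetteer, the hypothesis template "<span> is a <type>.", the span-length limit, and the rule that unmatched spans are treated as negatives are all assumptions made here for illustration. The sketch only shows how a sentence might be turned into premise/hypothesis pairs with weak labels drawn from a gazetteer, so that no token-level annotation is needed.

```python
from typing import List, Tuple

# Hypothetical toy gazetteer (assumption); a real one would come from a curated resource.
GAZETTEER = {"disease": {"breast cancer", "diabetes"}}


def candidate_spans(tokens: List[str], max_len: int = 3) -> List[Tuple[int, int]]:
    """Enumerate all (start, end) token spans of up to max_len tokens (end exclusive)."""
    return [
        (i, j)
        for i in range(len(tokens))
        for j in range(i + 1, min(i + max_len, len(tokens)) + 1)
    ]


def entailment_pairs(
    sentence: str, entity_type: str, max_len: int = 3
) -> List[Tuple[str, str, str]]:
    """Build (premise, hypothesis, weak_label) triples for one sentence.

    The premise is the sentence itself. The hypothesis template is an
    assumption of this sketch, not taken from the paper.
    """
    tokens = sentence.split()
    pairs = []
    for start, end in candidate_spans(tokens, max_len):
        span = " ".join(tokens[start:end])
        hypothesis = f"{span} is a {entity_type}."
        # Weak label from gazetteer lookup. Treating unmatched spans as
        # negatives is a simplification and is itself a source of noise.
        matched = span.lower() in GAZETTEER[entity_type]
        label = "entailment" if matched else "not_entailment"
        pairs.append((sentence, hypothesis, label))
    return pairs


pairs = entailment_pairs("BRCA1 mutations increase breast cancer risk", "disease")
positives = [hyp for _, hyp, lab in pairs if lab == "entailment"]
print(positives)
```

Each triple could then be scored by a pre-trained textual entailment model. The spans labeled positive and negative within one sentence are the kind of entity and non-entity pairs that an in-sentence contrastive objective would compare.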
Pages: 2577-2586
Number of pages: 10