Cascaded classifiers for confidence-based chemical named entity recognition

Cited by: 49
Authors
Corbett, Peter [1]
Copestake, Ann [2]
Affiliations
[1] Univ Cambridge, Chem Lab, Unilever Ctr Mol Sci Informat, Cambridge CB2 1EW, England
[2] Univ Cambridge, Comp Lab, Cambridge CB3 0FD, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Chemistry Paper; Mean Average Precision; Entity Recognition; Potential Entity; PubMed Abstract
DOI
10.1186/1471-2105-9-S11-S4
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Chemical named entities represent an important facet of biomedical text.
Results: We have developed a system to use character-based n-grams, Maximum Entropy Markov Models and rescoring to recognise chemical names and other such entities, and to make confidence estimates for the extracted entities. An adjustable threshold allows the system to be tuned to high precision or high recall. At a threshold set for balanced precision and recall, we were able to extract named entities at an F score of 80.7% from chemistry papers and 83.2% from PubMed abstracts. Furthermore, we were able to achieve 57.6% and 60.3% recall at 95% precision, and 58.9% and 49.1% precision at 90% recall.
Conclusion: These results show that chemical named entities can be extracted with good performance, and that the properties of the extraction can be tuned to suit the demands of the task.
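The trade-off the abstract describes, where a single confidence threshold moves the system between high precision and high recall, can be illustrated with a minimal sketch. This is not the authors' system: the function name `precision_recall_f`, the span representation, and the toy scores and gold spans below are all hypothetical, chosen only to show how raising the threshold tends to raise precision and lower recall.

```python
from typing import List, Set, Tuple

# Hypothetical representation: an entity is a (start, end) character span.
Span = Tuple[int, int]


def precision_recall_f(
    scored: List[Tuple[Span, float]], gold: Set[Span], threshold: float
) -> Tuple[float, float, float]:
    """Keep candidates whose confidence is >= threshold, then score them
    against the gold-standard spans using exact span matching."""
    predicted = {span for span, conf in scored if conf >= threshold}
    true_positives = len(predicted & gold)
    # Convention assumed here: precision is 1.0 when nothing is predicted.
    precision = true_positives / len(predicted) if predicted else 1.0
    recall = true_positives / len(gold) if gold else 1.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Toy data (invented for illustration): candidate spans with confidences.
SCORED = [
    ((0, 7), 0.95),
    ((10, 18), 0.80),
    ((20, 24), 0.60),  # false positive: not in GOLD
    ((30, 33), 0.30),
    ((40, 45), 0.10),  # false positive: not in GOLD
]
GOLD = {(0, 7), (10, 18), (30, 33), (50, 55)}

if __name__ == "__main__":
    # Sweep the threshold from strict to lenient.
    for t in (0.9, 0.5, 0.2, 0.05):
        p, r, f = precision_recall_f(SCORED, GOLD, t)
        print(f"threshold={t:.2f} precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

On this toy data a strict threshold keeps only the most confident candidate (high precision, low recall), while a lenient one admits more true entities together with more false positives. The "balanced" operating point mentioned in the abstract would correspond to a threshold where precision and recall are roughly equal.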
Pages: 10
Related papers
50 in total
  • [1] Cascaded classifiers for confidence-based chemical named entity recognition
    Peter Corbett
    Ann Copestake
    BMC Bioinformatics, 9
  • [2] A Confidence-based Partial Label Learning Model for Crowd-Annotated Named Entity Recognition
    Xiong, Limao
    Zhou, Jie
    Zhu, Qunxi
    Wang, Xiao
    Wu, Yuanbin
    Zhang, Qi
    Gui, Tao
    Huang, Xuanjing
    Ma, Jin
    Shan, Ying
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 1375 - 1386
  • [3] Distantly Supervised Named Entity Recognition via Confidence-Based Multi-Class Positive and Unlabeled Learning
    Zhou, Kang
    Li, Yuepei
    Li, Qi
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 7198 - 7211
  • [4] Named Entity Recognition among Chinese MicroBlog Based on Cascaded CRF
    Xing, Yixue
    Zhu, Yonghua
    Zhang, Ke
    Liu, Shenkai
    Zhou, Jin
    2018 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING (ICALIP), 2018, : 28 - 34
  • [5] Chinese Named Entity Recognition Based on Cascaded Conditional Random Fields
    Tan, Weixuan
    Kong, Fang
    Ni, Ji
    Zhou, Guodong
    11TH CHINESE LEXICAL SEMANTICS WORKSHOP (CKSW2010), 2010, : 465 - 471
  • [6] Chinese Chemical Named Entity Recognition Based on Morpheme
    Wang, Guirong
    Xia, Bo
    Xiao, Ye
    Rao, Gaoqi
    Xun, Endong
    2020 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP 2020), 2020, : 247 - 252
  • [7] Based Cascaded Conditional Random Fields Model for Chinese Named Entity Recognition
    Zhang Suxiang
    ICSP: 2008 9TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, VOLS 1-5, PROCEEDINGS, 2008, : 1574 - 1578
  • [8] A Confidence-based Entity Resolution Approach with Incomplete Information
    Gu, Qi
    Zhang, Yan
    Cao, Jian
    Xu, Guandong
    Cuzzocrea, Alfredo
    2014 INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA), 2014, : 97 - 103
  • [9] Named entity recognition in Chinese medical records based on cascaded conditional random field
    College of Communication Engineering, Jilin University, Changchun 130012, China
    Unknown, 130032, China
    Unknown, AB T9S3A3, Canada
    Jilin Daxue Xuebao (Gongxueban), 6 (1843-1848)
  • [10] Discriminative Named Entity Recognition of Speech Data using Speech Recognition Confidence
    Sudoh, Katsuhito
    Tsukada, Hajime
    Isozaki, Hideki
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 337 - 340