HunFlair: an easy-to-use tool for state-of-the-art biomedical named entity recognition

Cited by: 45
Authors
Weber, Leon [1, 2]
Saenger, Mario [1]
Munchmeyer, Jannes [1, 3]
Habibi, Maryam [1]
Leser, Ulf [1]
Akbik, Alan [1]
Affiliations
[1] Humboldt Univ, Comp Sci Dept, D-10099 Berlin, Germany
[2] Helmholtz Assoc, Grp Math Modelling Cellular Proc, Max Delbruck Ctr Mol Med, D-13125 Berlin, Germany
[3] GFZ German Res Ctr Geosci, Sect Seismol, D-14473 Potsdam, Germany
DOI
10.1093/bioinformatics/btab042
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Named entity recognition (NER) is an important step in biomedical information extraction pipelines. Tools for NER should be easy to use, cover multiple entity types, be highly accurate and be robust to variations in text genre and style. We present HunFlair, a NER tagger fulfilling these requirements. HunFlair is integrated into the widely used NLP framework Flair, recognizes five biomedical entity types, matches or exceeds state-of-the-art performance on a wide set of evaluation corpora, and is trained in a cross-corpus setting to avoid corpus-specific bias. Technically, it uses a character-level language model pretrained on roughly 24 million biomedical abstracts and three million full texts. It outperforms other off-the-shelf biomedical NER tools with an average gain of 7.26 pp over the next best tool in a cross-corpus setting and achieves on-par results with state-of-the-art research prototypes in in-corpus experiments. HunFlair can be installed with a single command and is applied with only four lines of code. Furthermore, it is accompanied by harmonized versions of 23 biomedical NER corpora.
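The abstract reports an "average gain" in percentage points (pp), which is an absolute difference between scores and not a relative improvement. The sketch below shows one plausible way such a figure could be computed, assuming it is an unweighted mean of per-corpus F1 differences; the abstract does not spell out the exact averaging scheme. The function name `average_gain_pp`, the corpus names and all scores are hypothetical and are not taken from the paper.

```python
from statistics import mean


def average_gain_pp(tool_f1: dict[str, float], baseline_f1: dict[str, float]) -> float:
    """Mean per-corpus F1 difference, in percentage points.

    Both arguments map a corpus name to an F1 score given in percent.
    Only corpora present in both mappings are compared.
    """
    shared = sorted(tool_f1.keys() & baseline_f1.keys())
    if not shared:
        raise ValueError("no corpus is shared between the two tools")
    return mean(tool_f1[corpus] - baseline_f1[corpus] for corpus in shared)


# Hypothetical F1 scores in percent (illustration only, NOT results from the paper).
tool = {"corpus_a": 80.0, "corpus_b": 70.0, "corpus_c": 60.0}
baseline = {"corpus_a": 74.0, "corpus_b": 61.0, "corpus_c": 54.0}

gain = average_gain_pp(tool, baseline)
print(f"average gain: {gain:.2f} pp")
```

With these made-up numbers the per-corpus differences are 6, 9 and 6 pp, so the average gain is 7 pp. Expressed as a relative improvement over the baseline mean of 63, the same gap would be about 11%, which is why the two units should not be mixed.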
Pages: 2792-2794
Number of pages: 3