Chemical Named Entity Recognition with Deep Contextualized Neural Embeddings

Cited by: 1
Authors
Awan, Zainab [1]
Kahlke, Tim [2]
Ralph, Peter J. [2]
Kennedy, Paul J. [1]
Affiliations
[1] Univ Technol Sydney, Sch Comp Sci, Sydney, NSW, Australia
[2] Univ Technol Sydney, Climate Change Cluster, Sydney, NSW, Australia
Keywords
Named Entity Recognition; Deep Learning; Word Representation; BiLSTM
DOI
10.5220/0008163501350144
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Chemical named entity recognition (ChemNER) is a preliminary step in chemical information extraction pipelines. ChemNER has been approached with rule-based, dictionary-based, and feature-engineered machine learning methods, and more recently with deep learning. Traditional word embeddings, such as word2vec and GloVe, are inherently problematic because they ignore the context in which an entity appears. Contextualized embeddings known as Embeddings from Language Models (ELMo) were recently introduced to represent the contextual information of a word in its embedding space. In this work, we quantify the impact of contextualized embeddings on ChemNER using Bi-LSTM-CRF (bidirectional long short-term memory, conditional random field) networks. We benchmarked our approach on four well-known corpora for chemical named entity recognition. Our results show that incorporating ELMo yields statistically significant improvements in F1 score on all of the tested datasets.
Pages: 135-144
Page count: 10
Related papers
50 in total
  • [1] Pooled Contextualized Embeddings for Named Entity Recognition
    Akbik, Alan
    Bergmann, Tanja
    Vollgraf, Roland
    [J]. 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 724 - 728
  • [2] Improving Chemical Named Entity Recognition in Patents with Contextualized Word Embeddings
    Zhai, Zenan
    Dat Quoc Nguyen
    Akhondi, Saber A.
    Thorne, Camilo
    Druckenbrodt, Christian
    Cohn, Trevor
    Gregory, Michelle
    Verspoor, Karin
    [J]. SIGBIOMED WORKSHOP ON BIOMEDICAL NATURAL LANGUAGE PROCESSING (BIONLP 2019), 2019, : 328 - 338
  • [3] Shahmukhi named entity recognition by using contextualized word embeddings
    Tehseen, Amina
    Ehsan, Toqeer
    Bin Liaqat, Hannan
    Kong, Xiangjie
    Ali, Amjad
    Al-Fuqaha, Ala
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2023, 229
  • [4] A deep neural framework for named entity recognition with boosted word embeddings
    Goyal, Archana
    Gupta, Vishal
    Kumar, Manish
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (06) : 15533 - 15546
  • [5] A deep neural framework for named entity recognition with boosted word embeddings
    Archana Goyal
    Vishal Gupta
    Manish Kumar
    [J]. Multimedia Tools and Applications, 2024, 83 : 15533 - 15546
  • [6] Deep recurrent neural networks with word embeddings for Urdu named entity recognition
    Khan, Wahab
    Daud, Ali
    Alotaibi, Fahd
    Aljohani, Naif
    Arafat, Sachi
    [J]. ETRI JOURNAL, 2020, 42 (01) : 90 - 100
  • [7] Combining Contextualized Embeddings and Prior Knowledge for Clinical Named Entity Recognition: Evaluation Study
    Jiang, Min
    Sanger, Todd
    Liu, Xiong
    [J]. JMIR MEDICAL INFORMATICS, 2019, 7 (04) : 80 - 94
  • [8] Hierarchical Contextualized Representation for Named Entity Recognition
    Luo, Ying
    Xiao, Fengshun
    Zhao, Hai
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 8441 - 8448
  • [9] Deep learning with word embeddings improves biomedical named entity recognition
    Habibi, Maryam
    Weber, Leon
    Neves, Mariana
    Wiegandt, David Luis
    Leser, Ulf
    [J]. BIOINFORMATICS, 2017, 33 (14) : I37 - I48
  • [10] Poincare Embeddings in the Task of Named Entity Recognition
    Munoz, David
    Perez, Fernando
    Pinto, David
    [J]. ADVANCES IN COMPUTATIONAL INTELLIGENCE, MICAI 2020, PT II, 2020, 12469 : 193 - 204