Information Extraction from Electronic Medical Records Using Multitask Recurrent Neural Network with Contextual Word Embedding

Cited by: 26
Authors
Yang, Jianliang [1]
Liu, Yuenan [1]
Qian, Minghui [1]
Guan, Chenghua [2]
Yuan, Xiangfei [2]
Affiliations
[1] Renmin Univ China, Sch Informat Resource Management, 59 Zhongguancun Ave, Beijing 100872, Peoples R China
[2] Beijing Normal Univ, Sch Econ & Resource Management, 19 Xinjiekou Outer St, Beijing 100875, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2019 / Vol. 9 / Issue 18
Keywords
clinical named entity recognition; information extraction; multitask model; long short-term memory; conditional random field
DOI
10.3390/app9183658
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Clinical named entity recognition is an essential task for humans to analyze large-scale electronic medical records efficiently. Traditional rule-based solutions need considerable human effort to build rules and dictionaries; machine learning-based solutions need laborious feature engineering. Currently, deep learning solutions such as Long Short-term Memory with Conditional Random Field (LSTM-CRF) have achieved considerable performance on many datasets. In this paper, we developed a multitask attention-based bidirectional LSTM-CRF (Att-biLSTM-CRF) model with pretrained Embeddings from Language Models (ELMo) in order to achieve better performance. In the multitask system, an additional task, named entity discovery, was designed to enhance the model's perception of unknown entities. Experiments were conducted on the 2010 Informatics for Integrating Biology & the Bedside/Veterans Affairs (I2B2/VA) dataset. Experimental results show that our model outperforms the state-of-the-art solution in both the single-model and ensemble-model settings. Our work proposes an approach to improve recall in the clinical named entity recognition task based on the multitask mechanism.
Pages: 16