Overview of the First Natural Language Processing Challenge for Extracting Medication, Indication, and Adverse Drug Events from Electronic Health Record Notes (MADE 1.0)

Cited by: 97
Authors
Jagannatha, Abhyuday [1]
Liu, Feifan [2]
Liu, Weisong [3, 4]
Yu, Hong [1, 3, 4, 5]
Affiliations
[1] Univ Massachusetts, Coll Informat & Comp Sci, Amherst, MA 01003 USA
[2] Univ Massachusetts, Med Sch, Dept Quantitat Hlth Sci & Radiol, Worcester, MA 01605 USA
[3] Univ Massachusetts, Dept Comp Sci, 220 Pawtucket St, Lowell, MA 01854 USA
[4] Univ Massachusetts, Med Sch, Dept Med, Worcester, MA 01605 USA
[5] Bedford VAMC, Bedford, MA 01730 USA
Funding
National Institutes of Health (USA);
Keywords
SYSTEM; INFORMATION; PHARMACOVIGILANCE; PREDICTION; SAFETY; CORPUS; UMLS;
DOI
10.1007/s40264-018-0762-z
Chinese Library Classification number
R1 [Preventive Medicine, Hygiene];
Subject classification code
1004 ; 120402 ;
Abstract
Introduction: This work describes the Medication and Adverse Drug Events from Electronic Health Records (MADE 1.0) corpus and provides an overview of the MADE 1.0 2018 challenge for extracting medication, indication, and adverse drug events (ADEs) from electronic health record (EHR) notes.
Objective: The goal of MADE is to provide a set of common evaluation tasks to assess the state of the art for natural language processing (NLP) systems applied to EHRs supporting drug safety surveillance and pharmacovigilance. We also provide benchmarks on the MADE dataset using the system submissions received in the MADE 2018 challenge.
Methods: The MADE 1.0 challenge has released an expert-annotated cohort of medication and ADE information comprising 1089 fully de-identified longitudinal EHR notes from 21 randomly selected patients with cancer at the University of Massachusetts Memorial Hospital. Using this cohort as a benchmark, the MADE 1.0 challenge designed three shared NLP tasks. The named entity recognition (NER) task identifies medications and their attributes (dosage, route, duration, and frequency), indications, ADEs, and severity. The relation identification (RI) task identifies relations between the named entities: medication-indication, medication-ADE, and attribute relations. The third shared task (NER-RI) evaluates NLP models that perform the NER and RI tasks jointly. In total, 11 teams from four countries participated in at least one of the three shared tasks, and 41 system submissions were received.
Results: The best systems' F-1 scores for NER, RI, and NER-RI were 0.82, 0.86, and 0.61, respectively. Ensemble classifiers using the team submissions improved the performance further, with F-1 scores of 0.85, 0.87, and 0.66 for the three tasks, respectively.
Conclusion: MADE results show that recent progress in NLP has led to remarkable improvements in NER and RI tasks for the clinical domain. However, some room for improvement remains, particularly in the NER-RI task.
Pages: 99-111
Number of pages: 13