Evaluation of Natural Language Processing (NLP) systems to annotate drug product labeling with MedDRA terminology

Cited by: 31
Authors
Ly, Thomas [1]
Pamer, Carol [2]
Dang, Oanh [2]
Brajovic, Sonja [2]
Haider, Shahrukh [3]
Botsis, Taxiarchis [4, 6]
Milward, David [5]
Winter, Andrew [5]
Lu, Susan [2]
Ball, Robert [2]
Affiliations
[1] US FDA, CDER Off Biostat, Silver Spring, MD 20993 USA
[2] US FDA, CDER Off Surveillance & Epidemiol, Silver Spring, MD USA
[3] US FDA, CDER Off Translat Sci, Silver Spring, MD USA
[4] US FDA, CBER Off Biostat & Epidemiol, Silver Spring, MD USA
[5] Linguamatics Ltd, Cambridge, England
[6] Johns Hopkins Univ, Sidney Kimmel Comprehens Canc Ctr, Sch Med, Baltimore, MD USA
Keywords
Pharmacovigilance; FDA; MedDRA; Labeling; Drug Safety; SURVEILLANCE; INFORMATION
DOI
10.1016/j.jbi.2018.05.019
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Introduction: The FDA Adverse Event Reporting System (FAERS) is a primary data source for identifying unlabeled adverse events (AEs) in a drug or biologic drug product's postmarketing phase. Many AE reports must be reviewed by drug safety experts to identify unlabeled AEs, even if the reported AEs are previously identified, labeled AEs. Integrating the labeling status of drug product AEs into FAERS could increase report triage and review efficiency. Medical Dictionary for Regulatory Activities (MedDRA) is the standard for coding AE terms in FAERS cases. However, drug manufacturers are not required to use MedDRA to describe AEs in product labels. We hypothesized that natural language processing (NLP) tools could assist in automating the extraction and MedDRA mapping of AE terms in drug product labels.
Materials and methods: We evaluated the performance of three NLP systems (ETHER, I2E, MetaMap) for their ability to extract AE terms from drug labels and translate the terms to MedDRA Preferred Terms (PTs). Pharmacovigilance-based annotation guidelines for extracting AE terms from drug labels were developed for this study. We compared each system's output to MedDRA PT AE lists, manually mapped by FDA pharmacovigilance experts using the guidelines, for ten drug product labels known as the "gold standard AE list" (GSL) dataset. Strict time and configuration conditions were imposed in order to test each system's capabilities under conditions of no human intervention and minimal system configuration. Each NLP system's output was evaluated for precision, recall and F measure in comparison to the GSL. A qualitative error analysis (QEA) was conducted to categorize a random sample of each NLP system's false positive and false negative errors.
Results: A total of 417, 278, and 250 false positive errors occurred in the ETHER, I2E, and MetaMap outputs, respectively. A total of 100, 80, and 187 false negative errors occurred in the ETHER, I2E, and MetaMap outputs, respectively. Precision ranged from 64% to 77%, recall from 64% to 83% and F measure from 67% to 79%. I2E had the highest precision (77%), recall (83%) and F measure (79%). ETHER had the lowest precision (64%). MetaMap had the lowest recall (64%). The QEA found that the most prevalent false positive errors were context errors such as "Context error/General term", "Context error/Instructions or monitoring parameters", "Context error/Medical history preexisting condition underlying condition risk factor or contraindication", and "Context error/AE manifestations or secondary complication". The most prevalent false negative errors were in the "Incomplete or missed extraction" error category. Missing AE terms were typically due to long terms, or terms containing non-contiguous words which do not correspond exactly to MedDRA synonyms. MedDRA mapping errors were a minority of errors for ETHER and I2E but were the most prevalent false positive errors for MetaMap.
Conclusions: The results demonstrate that it may be feasible to use NLP tools to extract and map AE terms to MedDRA PTs. However, the NLP tools we tested would need to be modified or reconfigured to lower the error rates to support their use in a regulatory setting. Tools specific for extracting AE terms from drug labels and mapping the terms to MedDRA PTs may need to be developed to support pharmacovigilance. Conducting research using additional NLP systems on a larger, diverse GSL would also be informative.
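As a rough illustration of the evaluation described in the abstract, the sketch below scores one system's extracted MedDRA PTs against a gold standard list for a single label. It assumes exact, set-based matching of PT names and a balanced F measure (F1); the abstract does not state the exact matching or averaging rules, so these are assumptions. The function name `score_label` and the example term lists are hypothetical and are not taken from the study.

```python
def score_label(predicted_pts, gold_pts):
    """Compare predicted MedDRA PTs with a gold standard list for one label.

    Assumption: a prediction counts as correct only if the PT name matches
    a gold standard PT exactly; duplicates are collapsed by using sets.
    """
    predicted = set(predicted_pts)
    gold = set(gold_pts)

    true_pos = len(predicted & gold)   # PTs found by both system and experts
    false_pos = len(predicted - gold)  # PTs the system produced in error
    false_neg = len(gold - predicted)  # gold standard PTs the system missed

    precision = true_pos / (true_pos + false_pos) if predicted else 0.0
    recall = true_pos / (true_pos + false_neg) if gold else 0.0
    if precision + recall > 0:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0

    return {
        "tp": true_pos,
        "fp": false_pos,
        "fn": false_neg,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical example data, invented for illustration only.
gold = ["Nausea", "Headache", "Hepatotoxicity", "Rash"]
predicted = ["Nausea", "Headache", "Rash", "Hypertension", "Diabetes mellitus"]

scores = score_label(predicted, gold)
print(scores)
```

In this toy example the two extra terms play the role of false positives (for instance, a preexisting condition mentioned in the label and wrongly extracted as an AE), and the missed term plays the role of a false negative.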
Pages: 73-86
Page count: 14