Using natural language processing to identify opioid use disorder in electronic health record data

Cited by: 8
Authors
Singleton, Jade [1, 2]
Li, Chengxi [3]
Akpunonu, Peter D. [4]
Abner, Erin L. [1]
Kucharska-Newton, Anna M. [1, 5]
Affiliations
[1] Univ Kentucky, Coll Publ Hlth, Dept Epidemiol, Lexington, KY 40536 USA
[2] Univ Kentucky Healthcare IT Dept, Business Intelligence, Lexington, KY 40517 USA
[3] Univ Kentucky, Coll Engn, Dept Comp Sci, Lexington, KY 40526 USA
[4] Univ Kentucky Hosp, Emergency Med & Med Toxicol, Lexington, KY 40536 USA
[5] Univ North Carolina Chapel Hill, Gillings Sch Global Publ Hlth, Dept Epidemiol, Chapel Hill, NC 27514 USA
Keywords
Opioid use disorder; Natural language processing; Electronic healthcare records; ICD-10; ABUSE; PAIN;
DOI
10.1016/j.ijmedinf.2022.104963
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Background: As opioid prescriptions have risen, opioid use disorder (OUD) and its adverse outcomes have also increased. Accurate and complete epidemiologic surveillance of OUD, which is needed to inform prevention strategies, is challenging. The objective of this study was to ascertain the prevalence of OUD using two methods of identifying OUD in electronic health records (EHR): natural language processing (NLP) for text mining of unstructured clinical notes, and ICD-10-CM diagnostic codes.
Methods: Data were drawn from EHR for hospital and emergency department patient visits to a large regional academic medical center from 2017 to 2019. International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM) discharge codes were extracted for each visit. The rule-based NLP algorithm was developed in a stepwise process. First, a small sample of visits from 2017 was used to develop initial dictionaries. Next, EHR corresponding to 30,124 visits from 2018 were used to develop and evaluate the rule-based algorithm. A random sample of the results was manually reviewed to identify and address shortcomings in the algorithm, and to estimate the sensitivity and specificity of the two methods of ascertainment. Last, the final algorithm was applied to 29,212 visits from 2019 to estimate OUD prevalence.
Results: Overall, 2,332 unique visits were identified by either method, with substantial overlap: 1,381 (59.2 %) were identified by both. Of the total unique visits, 430 (18.4 %) were identified only by ICD-10-CM codes, and 521 (22.3 %) were identified only by NLP. The prevalence of visits with evidence of an OUD diagnosis in this sample, ascertained using only ICD-10-CM codes, was 1,811/29,212 (6.1 %). Including the additional 521 visits identified only by NLP, the estimated prevalence of OUD was 2,332/29,212 (7.9 %), an increase of 29.5 % compared to the use of ICD-10-CM codes alone. The estimated sensitivity and specificity of the NLP-based OUD classification were 81.8 % and 97.5 %, respectively, relative to gold-standard manual review by an expert addiction medicine physician.
Conclusion: NLP-based algorithms can automate data extraction and identify evidence of opioid use disorder in unstructured electronic healthcare records. The most complete ascertainment of OUD in EHR was achieved by combining NLP with ICD-10-CM codes. NLP should be considered for epidemiological studies involving EHR data.
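To make the two ascertainment routes concrete, the following is a minimal Python sketch of how a rule-based text match and an ICD-10-CM code check could be combined per visit. It is not the authors' algorithm: the abstract does not give their dictionaries, rules, or code list. The term list, the negation cues, the three-word negation window, the use of the F11 (opioid related disorders) code family, and all function names are assumptions made for illustration only.

```python
import re

# Hypothetical dictionary terms and negation cues (assumptions, not from the study).
OUD_TERMS = ["opioid use disorder", "opioid dependence", "opioid abuse", "heroin use"]
NEGATION_CUES = ["no", "denies", "without", "negative for"]

# Match any dictionary term as a whole phrase, case-insensitively.
TERM_RE = re.compile(
    r"\b(" + "|".join(re.escape(t) for t in OUD_TERMS) + r")\b", re.IGNORECASE
)
# A negation cue followed by at most three words, ending right before the term.
NEG_RE = re.compile(
    r"\b(" + "|".join(re.escape(c) for c in NEGATION_CUES) + r")\b(\s+\w+){0,3}\s*$",
    re.IGNORECASE,
)


def nlp_flags_oud(note: str) -> bool:
    """Return True if any sentence mentions a dictionary term that is not negated."""
    for sentence in re.split(r"[.;\n]", note):
        for match in TERM_RE.finditer(sentence):
            if not NEG_RE.search(sentence[: match.start()]):
                return True
    return False


def icd_flags_oud(codes: list[str]) -> bool:
    """Return True if any discharge code is in the F11 family (assumed code set)."""
    return any(code.upper().startswith("F11") for code in codes)


def classify_visit(note: str, codes: list[str]) -> str:
    """Label a visit by which method(s) found evidence of OUD."""
    nlp, icd = nlp_flags_oud(note), icd_flags_oud(codes)
    if nlp and icd:
        return "both"
    if icd:
        return "icd_only"
    if nlp:
        return "nlp_only"
    return "neither"
```

Counting the "both", "icd_only", and "nlp_only" labels across all visits yields the kind of overlap breakdown reported in the Results. A real implementation would need a much richer dictionary and context handling, which is why the study refined its rules through manual review.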
Pages: 7
Related papers
50 in total
  • [21] Statistical Natural Language Processing Can Accurately Identify Venous Thromboembolism (VTE) Events from Narrative Electronic Health Record Data
    Rochefort, Christian M.
    Verma, Aman D.
    Buckeridge, David L.
    [J]. PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, 2014, 23 : 326 - 327
  • [22] Use of Natural Language Processing of Patient-Initiated Electronic Health Record Messages to Identify Patients With COVID-19 Infection
    Mermin-Bunnell, Kellen
    Zhu, Yuanda
    Hornback, Andrew
    Damhorst, Gregory
    Walker, Tiffany
    Robichaux, Chad
    Mathew, Lejy
    Jaquemet, Nour
    Peters, Kourtney
    Johnson, Theodore M.
    Wang, May Dongmei
    Anderson, Blake
    [J]. JAMA NETWORK OPEN, 2023, 6 (07) : E2322299
  • [23] USING ELECTRONIC HEALTH RECORD DATA FOR COHORT DISCOVERY AND PHENOTYPING OF DEVELOPMENTAL LANGUAGE DISORDER
    Nitin, Rachana
    Walters, Courtney
    Boorom, Olivia
    Margulis, Katherine
    Davis, Lea
    Below, Jennifer
    Camarata, Stephen
    Gordon, Reyna
    [J]. EUROPEAN NEUROPSYCHOPHARMACOLOGY, 2019, 29 : S205 - S205
  • [24] Development and Use of Natural Language Processing for Identification of Distant Cancer Recurrence and Sites of Distant Recurrence Using Unstructured Electronic Health Record Data
    Karimi, Yasmin H.
    Blayney, Douglas W.
    Kurian, Allison W.
    Shen, Jeanne
    Yamashita, Rikiya
    Rubin, Daniel
    Banerjee, Imon
    [J]. JCO CLINICAL CANCER INFORMATICS, 2021, 5 : 469 - 478
  • [25] Natural Language Processing for Adjudication of Heart Failure in the Electronic Health Record
    Cunningham, Jonathan W.
    Singh, Pulkit
    Reeder, Christopher
    Lau, Emily S.
    Khurshid, Shaan
    Wang, Xin
    Ellinor, Patrick T.
    Lubitz, Steven A.
    Batra, Puneet
    Ho, Jennifer E.
    [J]. JACC-HEART FAILURE, 2023, 11 (07) : 852 - 854
  • [26] Improving the Efficiency of Clinical Trial Recruitment Using Electronic Health Record Data, Natural Language Processing, and Machine Learning
    Cai, Tianrun
    Cai, Fiona
    Dahal, Kumar
    Hong, Chuan
    Liao, Katherine
    [J]. ARTHRITIS & RHEUMATOLOGY, 2019, 71
  • [27] Developing Natural Language Processing to Extract Complementary and Integrative Health Information from Electronic Health Record Data
    Zhou, Huixue
    [J]. 2022 IEEE 10TH INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI 2022), 2022 : 474 - 475
  • [28] Natural Language Processing to Rapidly Identify Potential Signals for Adverse Events Using Electronic Medical Record Data: Example of Arthralgias and Vedolizumab
    Cai, Tianrun
    Kane-Wanger, Gwendolyn
    Bond, Allison
    Cagan, Andrew
    Murphy, Shawn N.
    Ananthakrishnan, Ashwin
    Liao, Katherine
    [J]. ARTHRITIS & RHEUMATOLOGY, 2016, 68
  • [29] Prediction of severe chest injury using natural language processing from the electronic health record
    Kulshrestha, Sujay
    Dligach, Dmitriy
    Joyce, Cara
    Baker, Marshall S.
    Gonzalez, Richard
    O'Rourke, Ann P.
    Glazer, Joshua M.
    Stey, Anne
    Kruser, Jacqueline M.
    Churpek, Matthew M.
    Afshar, Majid
    [J]. INJURY-INTERNATIONAL JOURNAL OF THE CARE OF THE INJURED, 2021, 52 (02) : 205 - 212
  • [30] EHRSUPPORT: NATURAL LANGUAGE PROCESSING OF TEXT NOTES ON SOCIAL SUPPORT IN THE ELECTRONIC HEALTH RECORD AND DATA AVAILABILITY
    Kroenke, Candyce
    Aoki, Rhonda-lee
    Alexeeff, Stacey
    Cronkite, David J.
    Mammini, Lauren
    Jones, Salene M.
    Kushi, Lawrence H.
    Strayhorn-Carter, Shaila
    Mosen, David
    Carrell, David
    [J]. ANNALS OF BEHAVIORAL MEDICINE, 2023, 57 : S280 - S280