Developing and validating a natural language processing algorithm to extract preoperative cannabis use status documentation from unstructured narrative clinical notes

Cited by: 3
Authors
Sajdeya, Ruba [1, 2, 7]
Mardini, Mamoun T. [3]
Tighe, Patrick J. [4]
Ison, Ronald L. [4]
Bai, Chen [3]
Jugl, Sebastian [5]
Hanzhi, Gao [6]
Zandbiglari, Kimia [5]
Adiba, Farzana, I [5]
Winterstein, Almut G. [5]
Pearson, Thomas A. [1, 2]
Cook, Robert L. [1, 2]
Rouhizadeh, Masoud [2, 5]
Affiliations
[1] Univ Florida, Coll Publ Hlth & Hlth Profess, Dept Epidemiol, Gainesville, FL USA
[2] Univ Florida, Coll Med, Gainesville, FL USA
[3] Univ Florida, Coll Med, Dept Hlth Outcomes & Biomed Informat, Gainesville, FL USA
[4] Univ Florida, Coll Med, Dept Anesthesiol, Gainesville, FL USA
[5] Univ Florida, Ctr Drug Evaluat & Safety CoDES, Dept Pharmaceut Outcomes & Policy, Gainesville, FL USA
[6] Univ Florida, Dept Biostat, Gainesville, FL USA
[7] Univ Florida, Emerging Pathogens Inst, Coll Publ Hlth & Hlth Profess, Coll Med, 2055 Mowry Rd,POB 100009, Gainesville, FL 32610 USA
Keywords
cannabis; perioperative outcomes; natural language processing; NLP; substance use; social determinants of health; HEALTH; IMPACT; CARE
DOI
10.1093/jamia/ocad080
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Objective: This study aimed to develop a natural language processing (NLP) algorithm using machine learning (ML) techniques to identify and classify documentation of preoperative cannabis use status.
Materials and Methods: We developed and applied a keyword search strategy to identify documentation of preoperative cannabis use status in clinical notes within 60 days of surgery. We manually reviewed matching notes to classify each instance of documentation into 8 categories based on the context, time, and certainty of the cannabis use documentation. We evaluated 2 conventional ML models and 3 deep learning models against manual annotation. We externally validated our model using the MIMIC-III dataset.
Results: The tested classifiers achieved classification results close to human performance with up to 93% and 94% precision and 95% recall of preoperative cannabis use status documentation. External validation showed consistent results with up to 94% precision and recall.
Discussion: Our NLP model successfully replicated human annotation of preoperative cannabis use documentation, providing a baseline framework for identifying and classifying documentation of cannabis use. We add to NLP methods applied in healthcare for clinical concept extraction and classification, mainly concerning social determinants of health and substance use. Our systematically developed lexicon provides a comprehensive knowledge-based resource covering a wide range of cannabis-related concepts for future NLP applications.
Conclusion: We demonstrated that documentation of preoperative cannabis use status could be accurately identified using an NLP algorithm. This approach can be employed to identify comparison groups based on cannabis exposure for growing research efforts aiming to guide cannabis-related clinical practices and policies.
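The keyword-search and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the four-term lexicon, the note format (a dict with `date` and `text`), the function names, and the reading of "within 60 days of surgery" as the 60 days up to and including the surgery date are all assumptions made for illustration. The study's actual lexicon, its 8 categories, and its ML models are not reproduced here.

```python
import re
from datetime import date

# Hypothetical mini-lexicon for illustration only; the study's
# systematically developed lexicon is far broader.
CANNABIS_TERMS = ["cannabis", "marijuana", "thc", "cbd"]
PATTERN = re.compile(r"\b(" + "|".join(CANNABIS_TERMS) + r")\b", re.IGNORECASE)


def in_preoperative_window(note_date, surgery_date, window_days=60):
    """True if the note was written 0..window_days days before surgery.

    Assumption: the window is preoperative only and includes the
    surgery date itself.
    """
    delta = (surgery_date - note_date).days
    return 0 <= delta <= window_days


def find_candidate_notes(notes, surgery_date):
    """Return notes inside the window that contain any lexicon term."""
    return [
        n for n in notes
        if in_preoperative_window(n["date"], surgery_date)
        and PATTERN.search(n["text"])
    ]


def precision_recall(gold, predicted, positive):
    """Precision and recall for one class against manual (gold) labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

In this sketch the keyword search only narrows the notes that go to manual review or to a classifier; the classifier's per-category labels would then be passed to `precision_recall` together with the manual annotations.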
Pages: 1418-1428
Page count: 11
Related papers
50 in total
  • [21] Natural Language Processing Algorithm to Extract Multiple Myeloma Stage From Oncology Notes in the Veterans Affairs Healthcare System
    Goryachev, Sergey D.
    Yildirim, Cenk
    DuMontier, Clark
    La, Jennifer
    Dharne, Mayuri
    Gaziano, J. Michael
    Brophy, Mary T.
    Munshi, Nikhil C.
    Driver, Jane A.
    Do, Nhan V.
    Fillmore, Nathanael R.
    JCO CLINICAL CANCER INFORMATICS, 2024, 8
  • [22] Refinement of a Generalized Natural Language Processing Algorithm for the Identification of Clinical Terms from Free-Text Clinical Notes
    Nunes, Anthony P.
    Mortimer, Kathleen M.
    Loughlin, Jeanne
    Wang, Florence T.
    Dore, David D.
    PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, 2015, 24 : 536 - 537
  • [23] Keyword Extraction Algorithm for Classifying Smoking Status from Unstructured Bilingual Electronic Health Records Based on Natural Language Processing
    Bae, Ye Seul
    Kim, Kyung Hwan
    Kim, Han Kyul
    Choi, Sae Won
    Ko, Taehoon
    Seo, Hee Hwa
    Lee, Hae-Young
    Jeon, Hyojin
    APPLIED SCIENCES-BASEL, 2021, 11 (19)
  • [24] Identification of Persistent Postoperative Opioid Use from Clinical Notes via a Natural Language Processing Engine
    Seng, Eri
    Mehdipour, Soraya
    Simpson, Sierra
    Gabriel, Rodney
    ANESTHESIA AND ANALGESIA, 2023, 136 : 649 - 651
  • [25] Clinical documentation of patient-reported medical cannabis use in primary care: Toward scalable extraction using natural language processing methods
    Carrell, David S.
    Cronkite, David J.
    Shea, Mary
    Oliver, Malia
    Luce, Casey
    Matson, Theresa E.
    Bobb, Jennifer F.
    Hsu, Clarissa
    Binswanger, Ingrid A.
    Browne, Kendall C.
    Saxon, Andrew J.
    McCormack, Jennifer
    Jelstrom, Eve
    Ghitza, Udi E.
    Campbell, Cynthia, I
    Bradley, Katharine A.
    Lapham, Gwen T.
    SUBSTANCE ABUSE, 2022, 43 (01) : 917 - 924
  • [26] Classifying early infant feeding status from clinical notes using natural language processing and machine learning
    Lemas, Dominick J.
    Du, Xinsong
    Rouhizadeh, Masoud
    Lewis, Braeden
    Frank, Simon
    Wright, Lauren
    Spirache, Alex
    Gonzalez, Lisa
    Cheves, Ryan
    Magalhaes, Marina
    Zapata, Ruben
    Reddy, Rahul
    Xu, Ke
    Parker, Leslie
    Harle, Chris
    Young, Bridget
    Louis-Jaques, Adetola
    Zhang, Bouri
    Thompson, Lindsay
    Hogan, William R.
    Modave, Francois
    SCIENTIFIC REPORTS, 2024, 14 (01)
  • [27] Developing a Classification Algorithm for Prediabetes Risk Detection From Home Care Nursing Notes Using Natural Language Processing
    Jeon, Eunjoo
    Kim, Aeri
    Lee, Jisoo
    Heo, Hyunsook
    Lee, Hana
    Woo, Kyungmi
    CIN-COMPUTERS INFORMATICS NURSING, 2023, 41 (07) : 539 - 547
  • [28] Building large-scale registries from unstructured clinical notes using a low-resource natural language processing pipeline
    Tavabi, Nazgol
    Pruneski, James
    Golchin, Shahriar
    Singh, Mallika
    Sanborn, Ryan
    Heyworth, Benton
    Landschaft, Assaf
    Kimia, Amir
    Kiapour, Ata
    ARTIFICIAL INTELLIGENCE IN MEDICINE, 2024, 151
  • [29] Use of a natural language processing-based approach to extract depression symptom severity and suicide ideation from clinical notes to support depression research
    Boussios, Costas
    Palmon, Noa
    Jones, Cameron
    Momen, Safiyy
    Alves, Pedro
    Leavy, Michelle B.
    Curhan, Gary
    Gliklich, Richard
    PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, 2021, 30 : 382 - 383
  • [30] Data Science and Natural Language Processing to Extract Information from Clinical Narratives
    Vydiswaran, V. G. Vinod
    Zhao, Xinyan
    Yu, Deahan
    CODS-COMAD 2021: PROCEEDINGS OF THE 3RD ACM INDIA JOINT INTERNATIONAL CONFERENCE ON DATA SCIENCE & MANAGEMENT OF DATA (8TH ACM IKDD CODS & 26TH COMAD), 2021 : 441 - 442