Developing and validating a natural language processing algorithm to extract preoperative cannabis use status documentation from unstructured narrative clinical notes

Cited by: 3
Authors
Sajdeya, Ruba [1, 2, 7]
Mardini, Mamoun T. [3]
Tighe, Patrick J. [4]
Ison, Ronald L. [4]
Bai, Chen [3]
Jugl, Sebastian [5]
Hanzhi, Gao [6]
Zandbiglari, Kimia [5]
Adiba, Farzana I. [5]
Winterstein, Almut G. [5]
Pearson, Thomas A. [1, 2]
Cook, Robert L. [1, 2]
Rouhizadeh, Masoud [2, 5]
Affiliations
[1] Univ Florida, Coll Publ Hlth & Hlth Profess, Dept Epidemiol, Gainesville, FL USA
[2] Univ Florida, Coll Med, Gainesville, FL USA
[3] Univ Florida, Coll Med, Dept Hlth Outcomes & Biomed Informat, Gainesville, FL USA
[4] Univ Florida, Coll Med, Dept Anesthesiol, Gainesville, FL USA
[5] Univ Florida, Ctr Drug Evaluat & Safety CoDES, Dept Pharmaceut Outcomes & Policy, Gainesville, FL USA
[6] Univ Florida, Dept Biostat, Gainesville, FL USA
[7] Univ Florida, Emerging Pathogens Inst, Coll Publ Hlth & Hlth Profess, Coll Med, 2055 Mowry Rd,POB 100009, Gainesville, FL 32610 USA
Keywords
cannabis; perioperative outcomes; natural language processing; NLP; substance use; social determinants of health; HEALTH; IMPACT; CARE
DOI
10.1093/jamia/ocad080
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Objective: This study aimed to develop a natural language processing (NLP) algorithm using machine learning (ML) techniques to identify and classify documentation of preoperative cannabis use status.
Materials and Methods: We developed and applied a keyword search strategy to identify documentation of preoperative cannabis use status in clinical documentation within 60 days of surgery. We manually reviewed matching notes to classify each instance of documentation into 8 categories based on the context, time, and certainty of cannabis use documentation. We evaluated 2 conventional ML models and 3 deep learning models against the manual annotation. We externally validated our model using the MIMIC-III dataset.
Results: The tested classifiers achieved classification results close to human performance with up to 93% and 94% precision and 95% recall of preoperative cannabis use status documentation. External validation showed consistent results with up to 94% precision and recall.
Discussion: Our NLP model successfully replicated human annotation of preoperative cannabis use documentation, providing a baseline framework for identifying and classifying documentation of cannabis use. We add to NLP methods applied in healthcare for clinical concept extraction and classification, mainly concerning social determinants of health and substance use. Our systematically developed lexicon provides a comprehensive knowledge-based resource covering a wide range of cannabis-related concepts for future NLP applications.
Conclusion: We demonstrated that documentation of preoperative cannabis use status could be accurately identified using an NLP algorithm. This approach can be employed to identify comparison groups based on cannabis exposure for growing research efforts aiming to guide cannabis-related clinical practices and policies.
Pages: 1418-1428
Page count: 11