A Large Language Model Screening Tool to Target Patients for Best Practice Alerts: Development and Validation

Cited by: 7
Authors
Savage, Thomas [1, 3]
Wang, John [2]
Shieh, Lisa [1]
Affiliations
[1] Stanford Univ, Div Hosp Med, Dept Med, Sch Med, Palo Alto, CA USA
[2] Stanford Univ, Dept Med, Division Gastroenterol & Hepatol, Palo Alto, CA USA
[3] Stanford Univ, Dept Med, Div Hosp Med, 300 Pasteur Dr, Palo Alto, CA 94304 USA
Keywords
large language models; language models; language model; EHR; health record; health records; quality improvement; Artificial Intelligence; Natural Language Processing
DOI
10.2196/49886
Chinese Library Classification number
R-058
Abstract
Background: Best Practice Alerts (BPAs) are alert messages shown to physicians in the electronic health record to encourage appropriate use of health care resources. While these alerts help both improve care and reduce costs, BPAs are often applied nonselectively across entire patient populations. The development of large language models (LLMs) provides an opportunity to selectively identify patients for BPAs.

Objective: In this paper, we present an example case where an LLM screening tool is used to select patients appropriate for a BPA encouraging the prescription of deep vein thrombosis (DVT) anticoagulation prophylaxis. The artificial intelligence (AI) screening tool was developed to identify patients experiencing acute bleeding and exclude them from receiving a DVT prophylaxis BPA.

Methods: Our AI screening tool used a BioMed-RoBERTa (Robustly Optimized Bidirectional Encoder Representations from Transformers Pretraining Approach; AllenAI) model to classify physician notes, identifying patients without active bleeding who are thus appropriate for a thromboembolism prophylaxis BPA. The BioMed-RoBERTa model was fine-tuned using 500 history and physical notes of patients from the MIMIC-III (Medical Information Mart for Intensive Care) database who were not prescribed anticoagulation. A development set of 300 MIMIC patient notes was used to determine the model's hyperparameters, and a separate test set of 300 patient notes was used to evaluate the screening tool.

Results: Our MIMIC-III test set population of 300 patients included 72 patients with bleeding (ie, not appropriate for a DVT prophylaxis BPA) and 228 patients without bleeding (ie, appropriate for a DVT prophylaxis BPA). The AI screening tool achieved a precision-recall area under the curve of 0.82 (95% CI 0.75-0.89) and a receiver operating characteristic area under the curve of 0.89 (95% CI 0.84-0.94). Compared with nonselectively sending alerts for all patients, the screening tool reduced the number of patients who would trigger an alert by 20% (240 instead of 300 alerts) and increased alert applicability by 14.8 percentage points (218 [90.8%] positive alerts from 240 total alerts instead of 228 [76%] positive alerts from 300 total alerts).

Conclusions: These results provide a proof of concept for how language models can be used as a screening tool for BPAs. We provide an example AI screening tool that uses a HIPAA (Health Insurance Portability and Accountability Act)-compliant BioMed-RoBERTa model deployed with minimal computing power. Larger models (eg, Generative Pre-trained Transformer-3, Generative Pre-trained Transformer-4, and Pathways Language Model) will exhibit superior performance but require data use agreements to be HIPAA compliant. We anticipate LLMs to revolutionize quality improvement in hospital medicine.
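
The alert figures reported in the Results can be rechecked with a few lines of arithmetic. The sketch below is illustrative only: the `AlertSummary` class and the variable names are hypothetical and do not come from the paper; only the four counts (300, 228, 240, 218) are taken from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertSummary:
    """Counts of alerts fired under one alerting policy (hypothetical helper)."""

    total_alerts: int        # alerts that would be shown to physicians
    appropriate_alerts: int  # alerts shown for patients without bleeding

    @property
    def applicability(self) -> float:
        """Fraction of fired alerts that reach an appropriate patient."""
        return self.appropriate_alerts / self.total_alerts


# Counts as reported in the abstract for the 300-patient test set.
baseline = AlertSummary(total_alerts=300, appropriate_alerts=228)  # alert everyone
screened = AlertSummary(total_alerts=240, appropriate_alerts=218)  # after screening

# Relative drop in the number of alerts fired.
alert_reduction = 1 - screened.total_alerts / baseline.total_alerts

# Absolute change in applicability, as a fraction (x100 for percentage points).
applicability_gain = screened.applicability - baseline.applicability

# Share of appropriate patients who still receive the alert after screening.
retained_appropriate = screened.appropriate_alerts / baseline.appropriate_alerts

print(f"Alert reduction:      {alert_reduction:.1%}")
print(f"Applicability:        {baseline.applicability:.1%} -> {screened.applicability:.1%}")
print(f"Applicability gain:   {applicability_gain * 100:.1f} percentage points")
print(f"Appropriate retained: {retained_appropriate:.1%}")
```

The last quantity is not stated in the abstract; it follows from the reported counts (218 of the 228 appropriate patients still trigger an alert) and shows the cost side of the screening trade-off.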
Pages: 6