Assessing calibration and bias of a deployed machine learning malnutrition prediction model within a large healthcare system

Cited by: 2
Authors
Liou, Lathan [1]
Scott, Erick [2]
Parchure, Prathamesh [3]
Ouyang, Yuxia [4, 5]
Egorova, Natalia [4]
Freeman, Robert [3]
Hofer, Ira S. [5, 6, 7]
Nadkarni, Girish N. [6, 7]
Timsina, Prem [3]
Kia, Arash [3, 5]
Levin, Matthew A. [3, 5, 6]
Affiliations
[1] Icahn Sch Med Mt Sinai, New York, NY 10029 USA
[2] cStruct, La Jolla, CA USA
[3] Icahn Sch Med Mt Sinai, Inst Healthcare Delivery Sci, New York, NY USA
[4] Icahn Sch Med Mt Sinai, Dept Populat Hlth Sci & Policy, New York, NY USA
[5] Icahn Sch Med Mt Sinai, Dept Anesthesiol Perioperat & Pain Med, New York, NY USA
[6] Icahn Sch Med Mt Sinai, Charles Bronfman Inst Personalized Med, New York, NY USA
[7] Icahn Sch Med Mt Sinai, Dept Med, Div Data Driven & Digital Med D3M, New York, NY USA
Source
NPJ DIGITAL MEDICINE | 2024 / Volume 7 / Issue 01
Keywords
REGRESSION; VALIDATION;
DOI
10.1038/s41746-024-01141-5
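Chinese Library Classification number
R19 [Health care organizations and services (health services administration)];
Discipline classification code
Abstract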
Malnutrition is a frequently underdiagnosed condition leading to increased morbidity, mortality, and healthcare costs. The Mount Sinai Health System (MSHS) deployed a machine learning model (MUST-Plus) to detect malnutrition upon hospital admission. However, in diverse patient groups, a poorly calibrated model may lead to misdiagnosis, exacerbating healthcare disparities. We explored the model's calibration across different variables, as well as methods to improve calibration. Data from adult patients admitted to five MSHS hospitals from January 1, 2021 to December 31, 2022, were analyzed. We compared MUST-Plus predictions with the registered dietitian's formal assessment. Hierarchical calibration was assessed and compared between the recalibration sample (N = 49,562) of patients admitted between January 1, 2021 and December 31, 2022, and the hold-out sample (N = 17,278) of patients admitted between January 1, 2023 and September 30, 2023. Statistical differences in calibration metrics were tested using bootstrapping with replacement. Before recalibration, the overall model calibration intercept was -1.17 (95% CI: -1.20, -1.14), slope was 1.37 (95% CI: 1.34, 1.40), and Brier score was 0.26 (95% CI: 0.25, 0.26). Both weak and moderate measures of calibration were significantly different between White and Black patients and between male and female patients. Logistic recalibration significantly improved calibration of the model across race and gender in the hold-out sample. The original MUST-Plus model showed significant differences in calibration between White and Black patients. It also overestimated malnutrition in females compared to males. Logistic recalibration effectively reduced miscalibration across all patient subgroups. Continual monitoring and timely recalibration can improve model accuracy.
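The calibration quantities named in the abstract (calibration intercept, calibration slope, Brier score, logistic recalibration, and bootstrap confidence intervals) can be illustrated with a minimal Python sketch. This is not the authors' code: the data are synthetic, every function and variable name is hypothetical, and the definitions assumed are the common ones (slope from regressing the outcome on the logit of the predicted probability, intercept from the same model with the slope fixed at 1).

```python
import math
import random

def logit(p, eps=1e-12):
    p = min(max(p, eps), 1.0 - eps)  # clip away from 0 and 1
    return math.log(p / (1.0 - p))

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def brier_score(probs, labels):
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)

def fit_logistic(x, y, fit_slope=True, max_iter=100, tol=1e-10):
    """Newton-Raphson fit of logit(P(y=1)) = a + b*x.
    With fit_slope=False, b stays at 1, so x acts as an offset."""
    a, b = 0.0, 1.0
    for _ in range(max_iter):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for xi, yi in zip(x, y):
            mu = sigmoid(a + b * xi)
            w, r = mu * (1.0 - mu), yi - mu
            g_a += r
            g_b += r * xi
            h_aa += w
            h_ab += w * xi
            h_bb += w * xi * xi
        if fit_slope:
            det = h_aa * h_bb - h_ab * h_ab
            da = (h_bb * g_a - h_ab * g_b) / det
            db = (h_aa * g_b - h_ab * g_a) / det
        else:
            da, db = g_a / h_aa, 0.0
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b

def calibration_intercept(probs, labels):
    return fit_logistic([logit(p) for p in probs], labels, fit_slope=False)[0]

def calibration_slope(probs, labels):
    return fit_logistic([logit(p) for p in probs], labels)[1]

def recalibrate(probs, a, b):
    """Logistic recalibration: map p to sigmoid(a + b * logit(p))."""
    return [sigmoid(a + b * logit(p)) for p in probs]

def bootstrap_ci(metric, probs, labels, n_boot=100, seed=0):
    """Percentile 95% interval from resampling rows with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([probs[i] for i in idx], [labels[i] for i in idx]))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]

# Synthetic data (assumption): a model that over-predicts risk.
rng = random.Random(42)
true_q = [0.05 + 0.6 * rng.random() for _ in range(6000)]
raw_p = [sigmoid((logit(q) + 1.0) / 1.3) for q in true_q]
labels = [1 if rng.random() < q else 0 for q in true_q]

# Fit the recalibration on one sample, evaluate on a later hold-out sample.
fit_p, fit_y = raw_p[:4000], labels[:4000]
hold_p, hold_y = raw_p[4000:], labels[4000:]
a_hat, b_hat = fit_logistic([logit(p) for p in fit_p], fit_y)
fit_recal = recalibrate(fit_p, a_hat, b_hat)
hold_recal = recalibrate(hold_p, a_hat, b_hat)

raw_intercept = calibration_intercept(hold_p, hold_y)
ci_low, ci_high = bootstrap_ci(calibration_intercept, hold_p, hold_y)
brier_before = brier_score(hold_p, hold_y)
brier_after = brier_score(hold_recal, hold_y)
print("hold-out intercept:", raw_intercept, "95% CI:", (ci_low, ci_high))
print("hold-out Brier before/after:", brier_before, brier_after)
```

In this sketch a negative intercept means the model over-predicts on average, and a slope above 1 means its predictions are too compressed toward the middle. Fitting `a_hat` and `b_hat` on one sample and scoring a separate hold-out sample mirrors the split described in the abstract; the subgroup comparisons it reports would amount to running the same metrics on each subgroup's rows.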
Pages: 7