Assessing calibration and bias of a deployed machine learning malnutrition prediction model within a large healthcare system

Cited by: 2
Authors
Liou, Lathan [1 ]
Scott, Erick [2 ]
Parchure, Prathamesh [3 ]
Ouyang, Yuxia [4 ,5 ]
Egorova, Natalia [4 ]
Freeman, Robert [3 ]
Hofer, Ira S. [5 ,6 ,7 ]
Nadkarni, Girish N. [6 ,7 ]
Timsina, Prem [3 ]
Kia, Arash [3 ,5 ]
Levin, Matthew A. [3 ,5 ,6 ]
Affiliations
[1] Icahn Sch Med Mt Sinai, New York, NY 10029 USA
[2] cStruct, La Jolla, CA USA
[3] Icahn Sch Med Mt Sinai, Inst Healthcare Delivery Sci, New York, NY USA
[4] Icahn Sch Med Mt Sinai, Dept Populat Hlth Sci & Policy, New York, NY USA
[5] Icahn Sch Med Mt Sinai, Dept Anesthesiol Perioperat & Pain Med, New York, NY USA
[6] Icahn Sch Med Mt Sinai, Charles Bronfman Inst Personalized Med, New York, NY USA
[7] Icahn Sch Med Mt Sinai, Dept Med, Div Data Driven & Digital Med D3M, New York, NY USA
Source
NPJ DIGITAL MEDICINE, 2024, 7 (01)
Keywords
REGRESSION; VALIDATION
DOI
10.1038/s41746-024-01141-5
Chinese Library Classification
R19 [Health care organization and services (health services administration)]
Abstract
Malnutrition is a frequently underdiagnosed condition leading to increased morbidity, mortality, and healthcare costs. The Mount Sinai Health System (MSHS) deployed a machine learning model (MUST-Plus) to detect malnutrition upon hospital admission. However, in diverse patient groups, a poorly calibrated model may lead to misdiagnosis, exacerbating healthcare disparities. We explored the model's calibration across different variables and methods to improve calibration. Data from adult patients admitted to five MSHS hospitals from January 1, 2021, through December 31, 2022, were analyzed. We compared MUST-Plus predictions to registered dietitians' formal assessments. Hierarchical calibration was assessed and compared between the recalibration sample (N = 49,562) of patients admitted between January 1, 2021, and December 31, 2022, and the hold-out sample (N = 17,278) of patients admitted between January 1, 2023, and September 30, 2023. Statistical differences in calibration metrics were tested using bootstrapping with replacement. Before recalibration, the overall model calibration intercept was -1.17 (95% CI: -1.20, -1.14), slope was 1.37 (95% CI: 1.34, 1.40), and Brier score was 0.26 (95% CI: 0.25, 0.26). Both weak and moderate measures of calibration were significantly different between White and Black patients and between male and female patients. Logistic recalibration significantly improved calibration of the model across race and gender in the hold-out sample. The original MUST-Plus model showed significant differences in calibration between White vs. Black patients. It also overestimated malnutrition in females compared to males. Logistic recalibration effectively reduced miscalibration across all patient subgroups. Continual monitoring and timely recalibration can improve model accuracy.
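The logistic recalibration the abstract describes can be sketched with a toy example. This is an illustrative assumption, not the paper's code: the synthetic data and variable names are invented, and where the paper's calibration intercept is conventionally estimated with the slope fixed at 1 (an offset model, which scikit-learn's `LogisticRegression` does not support directly), this sketch estimates slope and intercept jointly (Platt-style logistic recalibration on the logit of the raw predictions).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss

rng = np.random.default_rng(0)

# Synthetic stand-in for the paper's data: binary malnutrition labels and
# raw model probabilities that are informative but miscalibrated.
y = rng.binomial(1, 0.3, size=5000)
p_raw = np.clip(0.3 + 0.4 * (y - 0.3) + rng.normal(0, 0.15, size=y.size),
                1e-6, 1 - 1e-6)

# Logistic recalibration: refit labels against the logit of the raw
# probabilities. A very large C makes the fit effectively unpenalized.
logit = np.log(p_raw / (1 - p_raw)).reshape(-1, 1)
recal = LogisticRegression(C=1e6).fit(logit, y)

slope = recal.coef_[0][0]        # ideal: 1.0
intercept = recal.intercept_[0]  # ideal: 0.0
p_recal = recal.predict_proba(logit)[:, 1]

print(f"slope={slope:.2f} intercept={intercept:.2f}")
print(f"Brier before={brier_score_loss(y, p_raw):.3f} "
      f"after={brier_score_loss(y, p_recal):.3f}")
```

In practice the recalibration parameters would be estimated on the recalibration sample and applied to the hold-out sample, overall and within each subgroup being audited.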
Pages: 7
Related Papers
50 items in total
  • [1] Assessing the impacts of precipitation bias on distributed hydrologic model calibration and prediction accuracy
    Looper, Jonathan P.
    Vieux, Baxter E.
    Moreno, Maria A.
    JOURNAL OF HYDROLOGY, 2012, 418 : 110 - 122
  • [2] Assessing Machine Learning as a Tool to Explain Variance in Deployed Photovoltaic (PV) System Degradation
    Dunn, Jimmy C.
    Karin, Todd
    Shinn, Adam B.
    2021 IEEE 48TH PHOTOVOLTAIC SPECIALISTS CONFERENCE (PVSC), 2021, : 1606 - 1609
  • [3] Machine Learning Models for Early Prediction of Sepsis on Large Healthcare Datasets
    Camacho-Cogollo, Javier Enrique
    Bonet, Isis
    Gil, Bladimir
    Iadanza, Ernesto
    ELECTRONICS, 2022, 11 (09)
  • [4] Machine Learning for the Bias Correction of LDAPS Air Temperature Prediction Model
    Zhang, Geer
    PROCEEDINGS OF 2021 6TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING TECHNOLOGIES (ICMLT 2021), 2021, : 1 - 6
  • [5] Predictive model for assessing malnutrition in elderly hospitalized cancer patients: A machine learning approach
    Duan, Ran
    Li, Qingyuan
    Yuan, Qing Xiu
    Hu, Jiaxin
    Feng, Tong
    Ren, Tao
    GERIATRIC NURSING, 2024, 58 : 388 - 398
  • [6] Predicting Prenatal Depression and Assessing Model Bias Using Machine Learning Models
    Huang, Yongchao
    Alvernaz, Suzanne
    Kim, Sage J.
    Maki, Pauline
    Dai, Yang
    Bernabe, Beatriz Penalver
    BIOLOGICAL PSYCHIATRY: GLOBAL OPEN SCIENCE, 2024, 4 (06):
  • [7] Personalised healthcare model for monitoring and prediction of air pollution: machine learning approach
    Behal, Veerawali
    Singh, Ramandeep
    JOURNAL OF EXPERIMENTAL & THEORETICAL ARTIFICIAL INTELLIGENCE, 2021, 33 (03) : 425 - 449
  • [8] Intelligent assessment and prediction system for somatic fitness and healthcare using machine learning
    Liu, Hsiao-Man
    Huang, Chung-Chi
    Huang, Chung-Lin
    Ke, Yen-Ting
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2021, 40 (04) : 7957 - 7967
  • [9] Should Fairness be a Metric or a Model? A Model-based Framework for Assessing Bias in Machine Learning Pipelines
    Lalor, John P.
    Abbasi, Ahmed
    Oketch, Kezia
    Yang, Yi
    Forsgren, Nicole
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2024, 42 (04)
  • [10] Amazon SageMaker Model Monitor: A System for Real-Time Insights into Deployed Machine Learning Models
    Nigenda, David
    Karnin, Zohar
    Zafar, Muhammad Bilal
    Ramesha, Raghu
    Tan, Alan
    Donini, Michele
    Kenthapadi, Krishnaram
    PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022, : 3671 - 3681