Machine Learning Prediction of Kidney Stone Composition Using Electronic Health Record-Derived Features

Cited by: 14
Authors
Abraham, Abin [1, 2]
Kavoussi, Nicholas L. [3]
Sui, Wilson [3]
Bejan, Cosmin [4]
Capra, John A. [1, 2, 5, 6]
Hsi, Ryan [3]
Affiliations
[1] Vanderbilt Univ, Dept Biol Sci, Vanderbilt Genet Inst, Nashville, TN USA
[2] Vanderbilt Univ, Struct Biol Ctr, Nashville, TN USA
[3] Vanderbilt Univ, Dept Urol, Med Ctr, 1211 Med Ctr Dr, Nashville, TN 37212 USA
[4] Vanderbilt Univ, Dept Biomed Informat, Med Ctr, Nashville, TN USA
[5] Univ Calif San Francisco, Bakar Computat Hlth Sci Inst, San Francisco, CA USA
[6] Univ Calif San Francisco, Dept Epidemiol & Biostat, San Francisco, CA USA
Funding
National Institutes of Health (NIH), USA
Keywords
kidney stone; 24H urine; machine learning; URINARY PH; RISK;
DOI
10.1089/end.2021.0211
Chinese Library Classification (CLC) number
R5 [Internal Medicine]; R69 [Urology (Diseases of the Genitourinary System)]
Discipline classification codes
1002; 100201
Abstract
Objectives: To assess the accuracy of machine learning models in predicting kidney stone composition using variables extracted from the electronic health record (EHR).
Materials and Methods: We identified kidney stone patients (n = 1296) with both stone composition and 24-hour (24H) urine testing. We trained machine learning models (XGBoost [XG] and logistic regression [LR]) to predict stone composition using 24H urine data and EHR-derived demographic and comorbidity data. Models predicted either binary (calcium vs noncalcium stone) or multiclass (calcium oxalate, uric acid, hydroxyapatite, or other) stone types. We evaluated performance using area under the receiver operating characteristic curve (ROC-AUC) and accuracy and identified predictors for each task.
Results: For discriminating binary stone composition, XG outperformed LR with higher accuracy (91% vs 71%) with ROC-AUC of 0.80 for both models. Top predictors used by these models were supersaturations of uric acid and calcium phosphate, and urinary ammonium. For multiclass classification, LR outperformed XG with higher accuracy (0.64 vs 0.56) and ROC-AUC (0.79 vs 0.59), and urine pH had the highest predictive utility. Overall, 24H urine analyte data contributed more to the models' predictions of stone composition than EHR-derived variables.
Conclusion: Machine learning models can predict calcium stone composition. LR outperforms XG in multiclass stone classification. Demographic and comorbidity data are predictive of stone composition; however, including 24H urine data improves performance. Further optimization of performance could lead to earlier directed medical therapy for kidney stone patients.
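The abstract reports two binary models with different accuracy but the same ROC-AUC. The sketch below is one way to see how that can happen: accuracy depends on a decision threshold, while ROC-AUC depends only on how cases are ranked. Everything in it is an assumption for illustration: the labels and scores are synthetic (not study data), the encoding 1 = calcium and 0 = noncalcium is hypothetical, and the 0.5 threshold is arbitrary. It does not reproduce the authors' pipeline.

```python
def accuracy(y_true, scores, threshold=0.5):
    """Fraction of cases classified correctly at a fixed decision threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == t for p, t in zip(preds, y_true))
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as half a win)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one case of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Synthetic, imbalanced toy cohort: 8 "calcium" (1) and 2 "noncalcium" (0) cases.
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]

# Two hypothetical models that rank the cases identically;
# model_b's scores are simply lower across the board.
model_a = [0.90, 0.80, 0.85, 0.70, 0.95, 0.60, 0.75, 0.65, 0.72, 0.30]
model_b = [0.60, 0.50, 0.55, 0.40, 0.65, 0.30, 0.45, 0.35, 0.42, 0.00]

acc_a, acc_b = accuracy(y_true, model_a), accuracy(y_true, model_b)
auc_a, auc_b = roc_auc(y_true, model_a), roc_auc(y_true, model_b)

print(f"model_a: accuracy={acc_a:.2f}, ROC-AUC={auc_a:.4f}")  # accuracy=0.90, ROC-AUC=0.8125
print(f"model_b: accuracy={acc_b:.2f}, ROC-AUC={auc_b:.4f}")  # accuracy=0.60, ROC-AUC=0.8125
```

In this toy case the ranking of patients is the same for both models, so ROC-AUC is identical, yet the fixed threshold labels far fewer cases correctly for the lower-scoring model. Whether this explains the gap reported in the abstract is not something the record itself states.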
Pages: 243-250
Page count: 8