External validation: a simulation study to compare cross-validation versus holdout or external testing to assess the performance of clinical prediction models using PET data from DLBCL patients

Cited by: 21
Authors
Eertink, Jakoba J. [1 ,2 ]
Heymans, Martijn W. [3 ,4 ]
Zwezerijnen, Gerben J. C. [2 ,5 ]
Zijlstra, Josee M. [1 ,2 ]
de Vet, Henrica C. W. [3 ,4 ]
Boellaard, Ronald [2 ,5 ]
Affiliations
[1] Amsterdam UMC Locat Vrije Univ Amsterdam, Dept Hematol, De Boelelaan 1117, NL-1081 HV Amsterdam, Netherlands
[2] Canc Ctr Amsterdam, Imaging & Biomarkers, Amsterdam, Netherlands
[3] Amsterdam UMC Locat Vrije Univ Amsterdam, Epidemiol & Data Sci, Amsterdam, Netherlands
[4] Amsterdam Publ Hlth Res Inst, Methodol, Amsterdam, Netherlands
[5] Amsterdam UMC Locat Vrije Univ Amsterdam, Radiol & Nucl Med, Amsterdam, Netherlands
Keywords
Internal validation; External validation; Model performance; CV-AUC
DOI
10.1186/s13550-022-00931-w
CLC classification
R8 [Special Medicine]; R445 [Diagnostic Imaging]
Subject classification codes
1002; 100207; 1009
Abstract
Aim: Clinical prediction models need to be validated. In this study, we used simulated data to compare various internal and external validation approaches.
Methods: Data of 500 patients were simulated using the distributions of metabolic tumor volume, standardized uptake value, the maximal distance between the largest lesion and another lesion, WHO performance status, and age of 296 diffuse large B cell lymphoma patients. These data were used to predict progression after 2 years based on an existing logistic regression model. Using the simulated data, we applied cross-validation, bootstrapping and a holdout set (n = 100). We also simulated new external datasets (n = 100, 200, 500) and, in four further scenarios, (1) simulated stage-specific external datasets, (2) varied the cut-off for high-risk patients, (3) varied the false-positive and false-negative rates, and (4) simulated a dataset with EARL2 characteristics. All internal and external simulations were repeated 100 times. Model performance was expressed as the cross-validated area under the curve (CV-AUC ± SD) and the calibration slope.
Results: Cross-validation (0.71 ± 0.06) and holdout (0.70 ± 0.07) yielded comparable model performance, but the estimate was more uncertain with the holdout set. Bootstrapping resulted in a CV-AUC of 0.67 ± 0.02. The calibration slope was comparable across these internal validation approaches. Increasing the size of the test set resulted in more precise CV-AUC estimates and a smaller SD for the calibration slope. For stage-specific test datasets, the CV-AUC increased with higher Ann Arbor stage. As expected, changing the cut-off for high risk and the false-positive and false-negative rates influenced model performance, which is clearly reflected in the low calibration slope. The EARL2 dataset resulted in similar model performance and precision, but the calibration slope indicated overfitting.
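The repeated-testing setup described in the Methods can be sketched in a few lines of NumPy. This is a minimal illustration, not the study's actual simulation: a single simulated risk factor stands in for the five PET and clinical predictors, and the event rate, coefficients, and seed are assumptions. It reproduces the qualitative result that a small test set (n = 100) gives a much less precise AUC estimate than a larger one (n = 500):

```python
import numpy as np

rng = np.random.default_rng(7)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def auc(scores, labels):
    # Rank-based (Mann-Whitney) AUC; assumes continuous scores (no ties)
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def simulate_test_set(n):
    # One illustrative risk factor; outcome drawn from a fixed logistic model
    x = rng.normal(size=n)
    y = rng.binomial(1, sigmoid(-1.0 + x))  # roughly 27% event rate
    return x, y  # x doubles as the fixed model's linear predictor

# Repeat "external testing" 100 times for two test-set sizes,
# mirroring the 100 repetitions used in the study
aucs_100 = np.array([auc(*simulate_test_set(100)) for _ in range(100)])
aucs_500 = np.array([auc(*simulate_test_set(500)) for _ in range(100)])

print(f"n=100: AUC {aucs_100.mean():.2f} ± {aucs_100.std():.2f}")
print(f"n=500: AUC {aucs_500.mean():.2f} ± {aucs_500.std():.2f}")
```

The SD of the AUC over repetitions shrinks markedly as the test set grows, which is the paper's argument against relying on a single small holdout or external set.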
Conclusion: With small datasets, it is not advisable to use a holdout set or a very small external dataset with similar characteristics, because a single small test dataset suffers from large uncertainty; repeated cross-validation using the full training dataset is preferred instead. Our simulations also demonstrate that it is important to consider differences in patient population between training and test data, which may require adjustment for, or stratification of, relevant variables.
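The calibration slope used throughout the abstract is the slope from a logistic regression of the observed outcome on the model's linear predictor: a slope near 1 indicates good calibration, while a slope below 1 signals overfitting (predictions that are too extreme). A minimal sketch, using an assumed simulated predictor rather than the study's data:

```python
import numpy as np

rng = np.random.default_rng(7)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def calibration_slope(lin_pred, y, iters=25):
    # Slope from logistic regression of the outcome on the model's
    # linear predictor, fitted by Newton-Raphson (IRLS)
    X = np.column_stack([np.ones_like(lin_pred), lin_pred])
    beta = np.zeros(2)
    for _ in range(iters):
        p = sigmoid(X @ beta)
        grad = X.T @ (y - p)                   # score vector
        H = (X * (p * (1 - p))[:, None]).T @ X  # Fisher information
        beta += np.linalg.solve(H, grad)
    return beta[1]

# Well-calibrated predictions: outcome truly follows sigmoid(lp)
lp = rng.normal(size=2000)
y = rng.binomial(1, sigmoid(lp))

slope_ok = calibration_slope(lp, y)            # expected near 1
slope_overfit = calibration_slope(2 * lp, y)   # too-extreme predictions, < 1

print(f"calibrated: {slope_ok:.2f}, overfitted: {slope_overfit:.2f}")
```

Doubling the linear predictor mimics an overfitted model whose predictions are more extreme than the data support, which is the pattern the EARL2 scenario exposed.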
Pages: 8