Development and validation of clinical prediction models: Marginal differences between logistic regression, penalized maximum likelihood estimation, and genetic programming

Cited by: 13
Authors
Janssen, Kristel J. M. [1 ]
Siccama, Ivar [2 ]
Vergouwe, Yvonne [1 ]
Koffijberg, Hendrik [1 ]
Debray, T. P. A. [1 ]
Keijzer, Maarten [3 ]
Grobbee, Diederick E. [1 ]
Moons, Karel G. M. [1 ]
Affiliations
[1] Univ Med Ctr Utrecht, Julius Ctr Hlth Sci & Primary Care, NL-3508 AB Utrecht, Netherlands
[2] Erasmus MC, Dept Neurol, Rotterdam, Netherlands
[3] Pegasyst Benelux, Amsterdam, Netherlands
Keywords
Prediction model; Logistic regression; Penalized maximum likelihood estimation; Genetic programming; Deep-vein thrombosis; Neural networks; Primary care; Simulation; Selection; Risk; Variables; Curve
DOI
10.1016/j.jclinepi.2011.08.011
Chinese Library Classification (CLC) number
R19 [Health care organizations and services (health services administration)]
Abstract
Objective: Many prediction models are developed by multivariable logistic regression, but several alternative development methods exist. We compared the accuracy of a model that predicts the presence of deep venous thrombosis (DVT) when developed by four different methods. Study Design and Setting: We used data from 2,086 primary care patients suspected of DVT, which included 21 candidate predictors. The cohort was split into a derivation set (1,668 patients, 329 with DVT) and a validation set (418 patients, 86 with DVT). In addition, 100 cross-validations were conducted in the full cohort. The models were developed by logistic regression, logistic regression with shrinkage by bootstrapping techniques, logistic regression with shrinkage by penalized maximum likelihood estimation, and genetic programming. The accuracy of the models was tested by assessing discrimination and calibration. Results: There were only marginal differences in the discrimination and calibration of the models in the validation set and in the cross-validations. Conclusion: The accuracy measures of the models developed by the four methods differed only slightly, and the 95% confidence intervals largely overlapped. These results suggest that good predictive accuracy is more likely to come from sensible modeling strategies than from complex development methods. © 2012 Elsevier Inc. All rights reserved.
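The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code or data: the synthetic cohort, the coefficients in TRUE_BETA, the ridge-type penalty value, and the gradient-ascent fitting routine are all assumptions made for illustration, and only two of the four methods (plain and penalized logistic regression) are shown. Discrimination is summarized here by the c-statistic and calibration by calibration-in-the-large (observed minus mean predicted risk), which are common choices but not necessarily the exact measures used in the study. The 80/20 split mirrors the 1,668/418 proportion reported above.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(beta, X):
    # beta[0] is the intercept, beta[1:] are the slopes.
    return [sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], xi))) for xi in X]

def fit_logistic(X, y, penalty=0.0, lr=0.5, n_iter=500):
    # Gradient ascent on the log-likelihood minus (penalty / 2) * sum(slope^2).
    # The intercept is left unpenalized. penalty=0 gives ordinary logistic regression.
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, yi, pi in zip(X, y, predict(beta, X)):
            err = yi - pi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        for j in range(p + 1):
            pen = penalty * beta[j] if j > 0 else 0.0
            beta[j] += lr * (grad[j] - pen) / n
    return beta

def c_statistic(probs, y):
    # Share of (event, non-event) pairs in which the event has the higher
    # predicted risk; ties count as one half.
    pos = [p for p, yi in zip(probs, y) if yi == 1]
    neg = [p for p, yi in zip(probs, y) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def calibration_in_the_large(probs, y):
    # Observed event rate minus mean predicted risk (0 is ideal).
    return sum(y) / len(y) - sum(probs) / len(probs)

# Synthetic cohort: hypothetical coefficients, not estimates from the DVT study.
random.seed(0)
TRUE_BETA = [-1.5, 1.0, 0.5, 0.0]
X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(600)]
y = [int(random.random() < p) for p in predict(TRUE_BETA, X)]

# 80/20 derivation/validation split.
X_dev, y_dev, X_val, y_val = X[:480], y[:480], X[480:], y[480:]

results = {}
for name, lam in [("unpenalized", 0.0), ("penalized", 5.0)]:
    beta = fit_logistic(X_dev, y_dev, penalty=lam)
    probs = predict(beta, X_val)
    results[name] = (beta, c_statistic(probs, y_val),
                     calibration_in_the_large(probs, y_val))
    print(name, "c-statistic:", round(results[name][1], 3),
          "calibration-in-the-large:", round(results[name][2], 3))
```

Running the script prints both accuracy measures for each fit on the held-out validation rows, so the two development methods can be compared side by side. In practice the penalty would be tuned (for example by cross-validation) rather than fixed.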
Pages: 404-412
Number of pages: 9