Computing AIC for black-box models using generalized degrees of freedom: A comparison with cross-validation

Cited by: 11
Authors
Hauenstein, Severin [1]
Wood, Simon N. [2]
Dormann, Carsten F. [1]
Affiliations
[1] Univ Freiburg, Dept Biometry & Environm Syst Anal, Tennenbacherstr 4, D-79106 Freiburg, Germany
[2] Univ Bristol, Sch Math, Bristol, Avon, England
Keywords
Boosted regression trees; Data perturbation; Model complexity; Random forest; SELECTION; INFORMATION; PREDICTION;
DOI
10.1080/03610918.2017.1315728
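Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification code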
020208 ; 070103 ; 0714 ;
Abstract
Generalized degrees of freedom (GDF), as defined by Ye (1998 JASA 93:120-131), represent the sensitivity of model fits to perturbations of the data. Such GDF can be computed for any statistical model, making it possible, in principle, to derive the effective number of parameters in machine-learning approaches and thus compute information-theoretical measures of fit. We compare GDF with cross-validation and find that the latter provides a less computer-intensive and more robust alternative. For Bernoulli-distributed data, GDF estimates were unstable and inconsistently sensitive to the number of data points perturbed simultaneously. Cross-validation, in contrast, performs well also for binary data, and for very different machine-learning approaches.
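The perturbation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Monte Carlo scheme in the spirit of Ye (1998), in which Gaussian noise is added to the response, the model is refitted, and the sensitivity of each fitted value to its own perturbation is estimated as a regression slope, with GDF as the sum of those slopes. The names `gdf_monte_carlo`, `ols_fit_predict`, and `gaussian_aic`, the perturbation scale `tau`, and the simulated data are all hypothetical choices for illustration, and the AIC is the Gaussian form up to an additive constant. Ordinary least squares is used as the "model" because its GDF should be close to its number of coefficients, which gives a sanity check.

```python
import numpy as np

def gdf_monte_carlo(fit_predict, X, y, n_perturb=500, tau=0.5, seed=0):
    """Estimate GDF by perturbing y and measuring how the fits respond.

    fit_predict(X, y) must refit the model and return fitted values at X.
    tau (perturbation s.d.) and n_perturb are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    deltas = rng.normal(0.0, tau, size=(n_perturb, n))
    fits = np.empty((n_perturb, n))
    for t in range(n_perturb):
        fits[t] = fit_predict(X, y + deltas[t])
    # Per-observation slope of fitted value on its own perturbation.
    d_c = deltas - deltas.mean(axis=0)
    f_c = fits - fits.mean(axis=0)
    slopes = (d_c * f_c).sum(axis=0) / (d_c ** 2).sum(axis=0)
    return float(slopes.sum())

def ols_fit_predict(X, y):
    """Stand-in model: ordinary least squares fitted values."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ beta

def gaussian_aic(y, fitted, dof):
    """Gaussian AIC up to an additive constant, with dof as the penalty."""
    n = len(y)
    rss = float(np.sum((y - fitted) ** 2))
    return n * np.log(rss / n) + 2.0 * dof

# Simulated data (assumed for illustration): intercept plus two predictors.
rng = np.random.default_rng(1)
n = 60
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(scale=1.0, size=n)

gdf = gdf_monte_carlo(ols_fit_predict, X, y)
aic = gaussian_aic(y, ols_fit_predict(X, y), gdf)
print(f"estimated GDF: {gdf:.2f}, AIC (up to a constant): {aic:.2f}")
```

Any black-box learner could be passed in place of `ols_fit_predict`, provided it refits on the perturbed response each time. The repeated refitting is what makes the approach computer-intensive, which is consistent with the abstract's comparison with cross-validation.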
Pages: 1382-1396
Page count: 15