Confidence Intervals for Random Forests: The Jackknife and the Infinitesimal Jackknife

Cited by: 0
Authors
Wager, Stefan [1 ]
Hastie, Trevor [1 ]
Efron, Bradley [1 ]
Institutions
[1] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
Funding
National Science Foundation (USA);
Keywords
bagging; jackknife methods; Monte Carlo noise; variance estimation; BIAS;
DOI
None available
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
We study the variability of predictions made by bagged learners and random forests, and show how to estimate standard errors for these methods. Our work builds on variance estimates for bagging proposed by Efron (1992, 2013) that are based on the jackknife and the infinitesimal jackknife (IJ). In practice, bagged predictors are computed using a finite number B of bootstrap replicates, and working with a large B can be computationally expensive. Direct applications of jackknife and IJ estimators to bagging require B = Θ(n^1.5) bootstrap replicates to converge, where n is the size of the training set. We propose improved versions that only require B = Θ(n) replicates. Moreover, we show that the IJ estimator requires 1.7 times fewer bootstrap replicates than the jackknife to achieve a given accuracy. Finally, we study the sampling distributions of the jackknife and IJ variance estimates themselves. We illustrate our findings with multiple experiments and simulation studies.
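The IJ variance estimate for bagging described in the abstract can be sketched numerically. Below is a minimal illustration, assuming a bagged sample-mean predictor (chosen here only for simplicity) and using Efron's covariance form V_IJ = Σᵢ Cov_b(N_bi, t_b)², where N_bi counts how often observation i appears in bootstrap replicate b and t_b is that replicate's prediction. Variable names are illustrative, and the final bias-correction line is a sketch of the paper's proposed Monte Carlo adjustment, not a verbatim transcription.

```python
import numpy as np

rng = np.random.default_rng(0)
n, B = 100, 2000                 # training-set size, number of bootstrap replicates
x = rng.normal(size=n)           # toy training data

# Draw B bootstrap samples; N[b, i] counts how often point i appears in replicate b.
N = rng.multinomial(n, np.full(n, 1.0 / n), size=B)  # shape (B, n)

# Replicate predictions: here each t[b] is the mean of bootstrap sample b.
t = (N @ x) / n                                       # shape (B,)

# Infinitesimal jackknife estimate: sum of squared covariances between
# the resampling counts N[:, i] and the replicate predictions t.
cov = ((N - N.mean(axis=0)) * (t - t.mean())[:, None]).mean(axis=0)
V_IJ = np.sum(cov ** 2)

# Sketch of the paper's bias correction for finite B, which subtracts a
# Monte Carlo noise term proportional to the variance of the replicates.
V_IJ_U = V_IJ - (n / B) * t.var()
```

For the bagged mean, V_IJ should land near var(x)/n, and the corrected estimate is slightly smaller, reflecting the removal of Monte Carlo noise that would otherwise require a much larger B to wash out.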
Pages: 1625 - 1651
Page count: 27