Random Forest Prediction Intervals

Cited by: 57
Authors
Zhang, Haozhe [1]
Zimmerman, Joshua [1]
Nettleton, Dan [1]
Nordman, Daniel J. [1]
Affiliations
[1] Iowa State Univ, Dept Stat, 2438 Osborn Dr, Ames, IA 50011 USA
Source
AMERICAN STATISTICIAN | 2020 / Vol. 74 / No. 4
Keywords
Conformal inference; Coverage rate; Interval width; Out-of-bag prediction errors; Quantile regression forests; Jackknife
DOI
10.1080/00031305.2019.1585288
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
Random forests are among the most popular machine learning techniques for prediction problems. When using random forests to predict a quantitative response, an important but often overlooked challenge is the determination of prediction intervals that will contain an unobserved response value with a specified probability. We propose new random forest prediction intervals that are based on the empirical distribution of out-of-bag prediction errors. These intervals can be obtained as a by-product of a single random forest. Under regularity conditions, we prove that the proposed intervals have asymptotically correct coverage rates. Simulation studies and analysis of 60 real datasets are used to compare the finite-sample properties of the proposed intervals with quantile regression forests and recently proposed split conformal intervals. The results indicate that intervals constructed with our proposed method tend to be narrower than those of competing methods while still maintaining marginal coverage rates approximately equal to nominal levels.
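The interval construction summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a random forest has already been fit elsewhere and that its out-of-bag (OOB) predictions for the training responses are available as plain lists. The function names `empirical_quantile` and `oob_prediction_interval`, the argument names, and the particular quantile convention (smallest order statistic whose empirical CDF reaches the target level) are assumptions made for this sketch.

```python
import math


def empirical_quantile(values, p):
    """Return the smallest order statistic whose empirical CDF is >= p.

    The quantile convention is an assumption of this sketch; other
    conventions (e.g. interpolation) would give slightly different limits.
    """
    ordered = sorted(values)
    n = len(ordered)
    k = max(1, math.ceil(p * n))  # 1-based rank, clamped to at least 1
    return ordered[k - 1]


def oob_prediction_interval(y_train, oob_pred, point_pred, alpha=0.1):
    """Sketch of a (1 - alpha) prediction interval from OOB errors.

    y_train    : observed training responses
    oob_pred   : out-of-bag forest predictions for those same cases
                 (assumed to come from a single fitted random forest)
    point_pred : forest prediction for the new case
    """
    # OOB prediction errors: observed response minus OOB prediction.
    errors = [y - p for y, p in zip(y_train, oob_pred)]
    # Shift the point prediction by the lower and upper error quantiles.
    lower = point_pred + empirical_quantile(errors, alpha / 2)
    upper = point_pred + empirical_quantile(errors, 1 - alpha / 2)
    return lower, upper


# Toy usage with made-up numbers (not data from the paper).
y_train = [6, 7, 8, 9, 11, 12, 13, 14]
oob_pred = [10] * 8
print(oob_prediction_interval(y_train, oob_pred, point_pred=20, alpha=0.5))
# prints (17, 22)
```

Because the interval reuses OOB errors that a single forest already produces, no second model fit or data split is needed, which is the "by-product" property the abstract refers to.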
Pages: 392-406
Page count: 15