Random Forest Prediction Intervals

Cited by: 56
Authors
Zhang, Haozhe [1]
Zimmerman, Joshua [1]
Nettleton, Dan [1]
Nordman, Daniel J. [1]
Affiliations
[1] Iowa State Univ, Dept Stat, 2438 Osborn Dr, Ames, IA 50011 USA
Source
AMERICAN STATISTICIAN | 2020 / Volume 74 / Issue 04
Keywords
Conformal inference; Coverage rate; Interval width; Out-of-bag prediction errors; Quantile regression forests; Jackknife
DOI
10.1080/00031305.2019.1585288
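Chinese Library Classification (CLC) codes
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract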
Random forests are among the most popular machine learning techniques for prediction problems. When using random forests to predict a quantitative response, an important but often overlooked challenge is the determination of prediction intervals that will contain an unobserved response value with a specified probability. We propose new random forest prediction intervals that are based on the empirical distribution of out-of-bag prediction errors. These intervals can be obtained as a by-product of a single random forest. Under regularity conditions, we prove that the proposed intervals have asymptotically correct coverage rates. Simulation studies and analysis of 60 real datasets are used to compare the finite-sample properties of the proposed intervals with quantile regression forests and recently proposed split conformal intervals. The results indicate that intervals constructed with our proposed method tend to be narrower than those of competing methods while still maintaining marginal coverage rates approximately equal to nominal levels.
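The interval construction described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `empirical_quantile` and `oob_prediction_interval` are hypothetical, and the quantile convention (the smallest order statistic whose empirical CDF reaches the target level) is an assumption, since the abstract does not specify one. The out-of-bag predictions are assumed to come from a single fitted forest, for example the `oob_prediction_` attribute of scikit-learn's `RandomForestRegressor` when fitted with `oob_score=True`.

```python
import math


def empirical_quantile(values, p):
    """Return the smallest order statistic whose empirical CDF is at least p.

    Assumption: this quantile convention is chosen for illustration only.
    """
    ordered = sorted(values)
    n = len(ordered)
    # ceil(n * p) is the 1-based rank; clamp it into [1, n].
    rank = min(max(math.ceil(n * p), 1), n)
    return ordered[rank - 1]


def oob_prediction_interval(y_train, oob_pred, point_pred, alpha=0.1):
    """Sketch of a (1 - alpha) prediction interval from out-of-bag errors.

    y_train    : observed training responses
    oob_pred   : out-of-bag predictions for the same training cases
    point_pred : the forest's point prediction for a new case
    """
    if len(y_train) != len(oob_pred):
        raise ValueError("y_train and oob_pred must have the same length")
    # Out-of-bag prediction errors: observed response minus OOB prediction.
    errors = [y - p for y, p in zip(y_train, oob_pred)]
    # Shift the point prediction by the lower and upper error quantiles.
    lower = point_pred + empirical_quantile(errors, alpha / 2)
    upper = point_pred + empirical_quantile(errors, 1 - alpha / 2)
    return lower, upper
```

Because the errors are collected from trees that did not see the corresponding training case, no extra holdout split or second forest is needed, which is what allows the intervals to be a by-product of a single forest.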
Pages: 392 - 406
Number of pages: 15
Related papers
50 in total
  • [31] Evaluation of Random Forest in Crime Prediction: Comparing Three-Layered Random Forest and Logistic Regression
    Oh, Gyeongseok
    Song, Juyoung
    Park, Hyoungah
    Na, Chongmin
    [J]. DEVIANT BEHAVIOR, 2022, 43 (09) : 1036 - 1049
  • [32] Exact prediction intervals for exponential lifetime based on random sample size
    Sultan, K. S.
    Abd Ellah, A. H.
    [J]. INTERNATIONAL JOURNAL OF COMPUTER MATHEMATICS, 2006, 83 (12) : 867 - 878
  • [33] Generalized prediction intervals for treatment effects in random-effects models
    Al-Sarraj, Razaw
    von Bromssen, Claudia
    Forkman, Johannes
    [J]. BIOMETRICAL JOURNAL, 2019, 61 (05) : 1242 - 1257
  • [34] Prediction intervals for generalized-order statistics with random sample size
    Basiri, Elham
    Ahmadi, Jafar
    [J]. JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2015, 85 (09) : 1725 - 1741
  • [35] RFpredInterval: An R Package for Prediction Intervals with Random Forests and Boosted Forests
    Alakuş, Cansu
    Larocque, Denis
    Labbe, Aurelie
    [J]. R JOURNAL, 2022, 14 (01) : 300 - 319
  • [36] A Unified Framework for Random Forest Prediction Error Estimation
    Lu, Benjamin
    Hardin, Johanna
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2021, 22
  • [37] ESG score prediction through random forest algorithm
    D’Amato, Valeria
    D’Ecclesia, Rita
    Levantesi, Susanna
    [J]. COMPUTATIONAL MANAGEMENT SCIENCE, 2022, 19 : 347 - 373
  • [38] RAINFALL PREDICTION USING RANDOM FOREST ALGORITHM TECHNIQUE
    Srinivasan, S.
    Rani, P. Shobha
    Malini
    Mahitha
    Surekha, Vema Lakshmi
    [J]. INTERNATIONAL JOURNAL OF EARLY CHILDHOOD SPECIAL EDUCATION, 2022, 14 (02) : 4503 - 4509
  • [39] Heterogeneity Aware Random Forest for Drug Sensitivity Prediction
    Rahman, Raziur
    Matlock, Kevin
    Ghosh, Souparno
    Pal, Ranadip
    [J]. SCIENTIFIC REPORTS, 2017, 7
  • [40] Prediction of Consumer Behaviour using Random Forest Algorithm
    Valecha, Harsh
    Varma, Aparna
    Khare, Ishita
    Sachdeva, Aakash
    Goyal, Mukta
    [J]. 2018 5TH IEEE UTTAR PRADESH SECTION INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS AND COMPUTER ENGINEERING (UPCON), 2018 : 653 - 658