A comparison of resampling and recursive partitioning methods in random forest for estimating the asymptotic variance using the infinitesimal jackknife

Cited by: 3
Authors
Brokamp, Cole [1]
Rao, M. B. [2]
Ryan, Patrick [1]
Jandarov, Roman [2]
Affiliations
[1] Cincinnati Childrens Hosp Med Ctr, Div Biostat & Epidemiol, Cincinnati, OH 45229 USA
[2] Univ Cincinnati, Dept Environm Hlth, Cincinnati, OH 45220 USA
Source
STAT | 2017 / Volume 6 / Issue 01
Keywords
conditional inference tree; infinitesimal jackknife; prediction variance; random forest
DOI
10.1002/sta4.162
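Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract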
The infinitesimal jackknife (IJ) has recently been applied to the random forest to estimate its prediction variance. The underlying theorems were verified under a traditional random forest framework that uses classification and regression trees and bootstrap resampling. However, random forests using conditional inference trees and subsampling have been found not to be prone to variable selection bias. Here, we conduct simulation experiments using a novel approach to explore the applicability of the IJ to random forests using variations on the resampling method and base learner. Test data points were simulated, and each was predicted by random forests trained on one hundred simulated training data sets under different combinations of resampling method and base learner. Using conditional inference trees instead of traditional classification and regression trees, as well as using subsampling instead of bootstrap sampling, resulted in a much more accurate estimation of prediction variance when using the IJ. The random forest variations here have been incorporated into an open-source software package for the R programming language. Copyright (c) 2017 John Wiley & Sons, Ltd.
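As an illustration of the kind of estimator the abstract refers to, the following is a minimal sketch of the IJ variance for a bagged predictor with bootstrap resampling: the sum over training observations of the squared covariance, across resamples, between the in-bag count and the per-resample prediction, with the usual Monte Carlo correction for a finite number of resamples. It is not the authors' R package or their simulation design. The function name `ij_variance` and all variable names are hypothetical, and the base learner is an assumption chosen for simplicity: each "tree" simply predicts the mean response of its bootstrap sample, so the result can be checked against the known plug-in variance of a sample mean.

```python
import numpy as np

def ij_variance(inbag_counts, preds):
    """IJ variance of a bagged prediction at a single test point.

    inbag_counts: (B, n) array, times observation i appears in resample b.
    preds:        (B,) array, prediction of base learner b at the test point.
    Returns (raw estimate, bias-corrected estimate). The correction used
    here is the one for bootstrap resampling with a finite B.
    """
    B, n = inbag_counts.shape
    centered_counts = inbag_counts - inbag_counts.mean(axis=0)
    centered_preds = preds - preds.mean()
    # Covariance over resamples between in-bag count and prediction,
    # one value per training observation.
    cov = centered_counts.T @ centered_preds / B
    raw = float(np.sum(cov ** 2))
    # Monte Carlo bias correction: n / B^2 * sum_b (t_b - mean(t))^2.
    corrected = raw - n / B**2 * float(np.sum(centered_preds ** 2))
    return raw, corrected

# Toy check (assumption: the base learner is the bootstrap-sample mean).
rng = np.random.default_rng(0)
n, B = 50, 5000
y = rng.normal(size=n)
counts = rng.multinomial(n, np.full(n, 1.0 / n), size=B)  # bootstrap in-bag counts
preds = counts @ y / n                                    # mean of each resample
raw, corrected = ij_variance(counts, preds)
target = np.var(y) / n  # plug-in variance of the sample mean
```

For this toy learner both estimates should land close to `target`. With real trees, `preds` would instead hold each tree's prediction at the test point and `counts` the forest's in-bag matrix.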
Pages: 360 - 372
Page count: 13
Related papers
13 in total
  • [1] A comparison of resampling and recursive partitioning methods in random forest for estimating the asymptotic variance using the infinitesimal jackknife
    Brokamp, Cole
    Rao, M.B.
    Ryan, Patrick
    Jandarov, Roman
    [J]. arXiv, 2017
  • [2] Estimating Individual Treatment Effect in Observational Data Using Random Forest Methods
    Lu, Min
    Sadiq, Saad
    Feaster, Daniel J.
    Ishwaran, Hemant
    [J]. JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2018, 27 (01) : 209 - 219
  • [3] INCOMPLETE 2-WAY RANDOM MODELS - A NUMERICAL COMPARISON OF 3 METHODS FOR ESTIMATING VARIANCE-COMPONENTS
    SEEGER, P
    LINDQVIST, B
    RONNINGEN, K
    [J]. ACTA AGRICULTURAE SCANDINAVICA, 1981, 31 (02) : 132 - 138
  • [4] Integrating Airborne Hyperspectral, Topographic, and Soil Data for Estimating Pasture Quality Using Recursive Feature Elimination with Random Forest Regression
    Pullanagari, Rajasheker R.
    Kereszturi, Gabor
    Yule, Ian
    [J]. REMOTE SENSING, 2018, 10 (07)
  • [5] A Comparison of Multi-Label Feature Selection Methods Using the Random Forest Paradigm
    Gharroudi, Ouadie
    Elghazel, Haytham
    Aussem, Alex
    [J]. ADVANCES IN ARTIFICIAL INTELLIGENCE, CANADIAN AI 2014, 2014, 8436 : 95 - 106
  • [6] Comparative role of various methods of estimating between study variance for meta-analysis using random effect method
    Pathak, Mona
    Dwivedi, Sada Nand
    Thakur, Bhaskar
    [J]. CLINICAL EPIDEMIOLOGY AND GLOBAL HEALTH, 2020, 8 (01) : 185 - 189
  • [7] Comparison and Evaluation of Three Methods for Estimating Forest above Ground Biomass Using TM and GLAS Data
    Liu, Kaili
    Wang, Jindi
    Zeng, Weisheng
    Song, Jinling
    [J]. REMOTE SENSING, 2017, 9 (04)
  • [8] COMPARISON OF THREE MODELING METHODS FOR ESTIMATING FOREST BIOMASS USING TM, GLAS AND FIELD MEASUREMENT DATA
    Liu, Kaili
    Wang, Jindi
    Zeng, Weisheng
    Song, Jinling
    [J]. 2017 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS), 2017 : 5774 - 5777
  • [9] Indicator element selection and geochemical anomaly mapping using recursive feature elimination and random forest methods in the Jingdezhen region of Jiangxi Province, South China
    Wang, Chengbin
    Pan, Yipeng
    Chen, Jianguo
    Ouyang, Yongpeng
    Rao, Jianfeng
    Jiang, Qibao
    [J]. APPLIED GEOCHEMISTRY, 2020, 122 (122)
  • [10] Estimating prawn abundance and catchability from catch-effort data: comparison of fixed and random effects models using maximum likelihood and hierarchical Bayesian methods
    Zhou, Shijie
    Vance, David J.
    Dichmont, Catherine M.
    Burridge, Charis Y.
    Toscas, Peter J.
    [J]. MARINE AND FRESHWATER RESEARCH, 2008, 59 (01) : 1 - 9