Bootstrap model selection

Cited by: 164
Author
Shao, J
Keywords
autoregressive time series; bootstrap sample size; generalized linear model; nonlinear regression; prediction error;
DOI
10.2307/2291661
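Chinese Library Classification number
O21 [Probability and Mathematical Statistics]; C8 [Statistics];
Discipline classification code
020208 ; 070103 ; 0714 ;
Abstract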
In a regression problem, there are typically p explanatory variables possibly related to a response variable, and we wish to select a subset of the p explanatory variables to fit a model between these variables and the response. A bootstrap variable/model selection procedure selects the subset of variables by minimizing bootstrap estimates of the prediction error, where the bootstrap estimates are constructed from a data set of size n. Although the bootstrap estimates have good properties, this bootstrap selection procedure is inconsistent in the sense that the probability of selecting the optimal subset of variables does not converge to 1 as n → ∞. This inconsistency can be rectified by modifying the sampling method used in drawing bootstrap observations. For bootstrapping pairs (response, explanatory variable), it is found that instead of drawing n bootstrap observations (the customary bootstrap sampling plan), far fewer bootstrap observations should be sampled: the bootstrap selection procedure becomes consistent if we draw m bootstrap observations with m → ∞ and m/n → 0. For bootstrapping residuals, we modify the bootstrap sampling procedure by increasing the variability among the bootstrap observations. The consistency of the modified bootstrap selection procedures is established in various situations, including linear models, nonlinear models, generalized linear models, and autoregressive time series. The choice of the bootstrap sample size m and some computational issues are also discussed. Some empirical results are presented.
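The pairs-bootstrap idea in the abstract can be illustrated with a minimal sketch for a linear model without intercept. It is not the paper's implementation. It assumes the criterion is the average squared prediction error on all n original observations of a least-squares fit computed from m resampled (response, explanatory variable) pairs, averaged over B bootstrap replicates. The function names (`solve`, `ols`, `pred_error`, `select_subset`), the simulated data, and the values n = 80, m = 20, B = 200 are all hypothetical choices made for illustration.

```python
import random
from itertools import combinations

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[piv][c]) < 1e-12:
            raise ValueError("singular design")
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / M[r][r]
    return x

def ols(X, y, cols):
    """Least-squares coefficients using only the columns in `cols`."""
    Z = [[row[j] for j in cols] for row in X]
    k = len(cols)
    XtX = [[sum(z[a] * z[b] for z in Z) for b in range(k)] for a in range(k)]
    Xty = [sum(z[a] * yi for z, yi in zip(Z, y)) for a in range(k)]
    return solve(XtX, Xty)

def pred_error(X, y, cols, beta):
    """Average squared prediction error on the data (X, y)."""
    total = 0.0
    for row, yi in zip(X, y):
        fit = sum(b * row[j] for b, j in zip(beta, cols))
        total += (yi - fit) ** 2
    return total / len(y)

def select_subset(X, y, m, B=200, seed=0):
    """Pick the column subset minimizing the m-out-of-n bootstrap error."""
    rng = random.Random(seed)
    n, p = len(y), len(X[0])
    subsets = [c for k in range(1, p + 1) for c in combinations(range(p), k)]
    # Draw m (not n) pairs with replacement; reuse the draws for every subset.
    draws = [[rng.randrange(n) for _ in range(m)] for _ in range(B)]
    scores = {}
    for cols in subsets:
        errs = []
        for idx in draws:
            Xb = [X[i] for i in idx]
            yb = [y[i] for i in idx]
            try:
                beta = ols(Xb, yb, cols)
            except ValueError:
                continue  # skip a degenerate resample (assumption of this sketch)
            # Fit on the m resampled pairs, evaluate on all n original pairs.
            errs.append(pred_error(X, y, cols, beta))
        scores[cols] = sum(errs) / len(errs) if errs else float("inf")
    return min(scores, key=scores.get), scores

# Simulated example: only columns 0 and 1 truly affect the response.
data_rng = random.Random(1)
n = 80
X = [[data_rng.gauss(0, 1) for _ in range(4)] for _ in range(n)]
y = [2.0 * r[0] - 3.0 * r[1] + data_rng.gauss(0, 1) for r in X]
best, scores = select_subset(X, y, m=20)
print(best)
```

Setting m = n in `select_subset` gives the customary sampling plan that the abstract describes as inconsistent. With a smaller m, each refit is noisier, so superfluous variables are penalized more heavily. The specific ratio m/n used here is arbitrary and carries no recommendation from the paper.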
Pages: 655-665
Number of pages: 11