UPPER BOUNDS ON THE MINIMUM COVERAGE PROBABILITY OF CONFIDENCE INTERVALS IN REGRESSION AFTER MODEL SELECTION

Cited: 12
Authors
Kabaila, Paul [1 ]
Giri, Khageswor [1 ]
Affiliations
[1] La Trobe Univ, Dept Math & Stat, Bundoora, Vic 3086, Australia
Keywords
Adjusted R²-statistic; AIC; 'best subset' regression; BIC; Mallows' criterion; t-tests; VARIABLE SELECTION; INFERENCE; REJECTION; PRETEST; ERROR;
DOI
10.1111/j.1467-842X.2009.00544.x
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Subject classification codes
020208 ; 070103 ; 0714 ;
Abstract
We consider a linear regression model, with the parameter of interest a specified linear combination of the components of the regression parameter vector. We suppose that, as a first step, a data-based model selection (e.g. by preliminary hypothesis tests or minimizing the Akaike information criterion - AIC) is used to select a model. It is common statistical practice to then construct a confidence interval for the parameter of interest, based on the assumption that the selected model had been given to us a priori. This assumption is false, and it can lead to a confidence interval with poor coverage properties. We provide an easily computed finite-sample upper bound (calculated by repeated numerical evaluation of a double integral) to the minimum coverage probability of this confidence interval. This bound applies for model selection by any of the following methods: minimum AIC, minimum Bayesian information criterion (BIC), maximum adjusted R², minimum Mallows' Cp and t-tests. The importance of this upper bound is that it delineates general categories of design matrices and model selection procedures for which this confidence interval has poor coverage properties. This upper bound is shown to be a finite-sample analogue of an earlier large-sample upper bound due to Kabaila and Leeb.
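The coverage failure the abstract describes can be seen in a small Monte Carlo sketch (illustrative only, not the paper's double-integral bound): two correlated regressors, a preliminary t-test used to decide whether to drop the second one, and a naive 95% interval for the first coefficient built as if the selected model were fixed in advance. All design choices below (n, the correlation, the value of β₂ near the pretest threshold, known error variance) are our own assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps, sigma, rho = 30, 20000, 1.0, 0.9  # illustrative choices

# Fixed design with strongly correlated regressors
x1 = rng.standard_normal(n)
x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.standard_normal(n)
X = np.column_stack([x1, x2])
XtX_inv = np.linalg.inv(X.T @ X)

se_full = sigma * np.sqrt(XtX_inv[0, 0])  # se of beta1-hat, full model
se2 = sigma * np.sqrt(XtX_inv[1, 1])      # se of beta2-hat (used in the pretest)
se_red = sigma / np.sqrt(x1 @ x1)         # se of beta1-hat, reduced model

beta1, beta2 = 1.0, 1.5 * se2  # beta2 deliberately near the pretest threshold
z = 1.96                       # nominal 95% interval (sigma treated as known)

covered = 0
for _ in range(reps):
    y = beta1 * x1 + beta2 * x2 + sigma * rng.standard_normal(n)
    b_full = XtX_inv @ (X.T @ y)
    if abs(b_full[1] / se2) > z:
        # Pretest retains x2: interval from the full model
        lo, hi = b_full[0] - z * se_full, b_full[0] + z * se_full
    else:
        # Pretest drops x2: interval from the reduced model, ignoring selection
        b_red = (x1 @ y) / (x1 @ x1)
        lo, hi = b_red - z * se_red, b_red + z * se_red
    covered += (lo <= beta1 <= hi)

coverage = covered / reps
print(f"naive post-selection coverage: {coverage:.3f}  (nominal 0.95)")
```

Because the reduced-model estimator is biased by the omitted β₂ whenever the pretest (wrongly) drops x₂, the realized coverage falls well below the nominal 0.95, which is the kind of coverage deficiency the paper's upper bound quantifies.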
Pages: 271-287
Page count: 17