Influential data cases when the Cp criterion is used for variable selection in multiple linear regression

Cited by: 3
Authors
Steel, SJ [1 ]
Uys, DW [1 ]
Affiliations
[1] Univ Stellenbosch, Dept Stat & Actuarial Sci, ZA-7602 Matieland, South Africa
Keywords
C-p criterion; influential data cases; multiple linear regression; variable selection
DOI
10.1016/j.csda.2005.02.003
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
The influence of data cases when the C-p criterion is used for variable selection in multiple linear regression analysis is studied in terms of the predictive power and the predictor variables included in the resulting model when variable selection is applied. In particular, the focus is on the importance of identifying and dealing with these so-called selection influential data cases before model selection and fitting are performed. A new selection influence measure based on the C-p criterion to identify selection influential data cases is developed. The success with which this influence measure identifies selection influential data cases is evaluated in two example data sets. (C) 2005 Elsevier B.V. All rights reserved.
Pages: 1840-1854
Page count: 15