Influential data cases when the Cp criterion is used for variable selection in multiple linear regression

Cited by: 3
Authors
Steel, SJ [1]
Uys, DW [1]
Affiliations
[1] Univ Stellenbosch, Dept Stat & Actuarial Sci, ZA-7602 Matieland, South Africa
Keywords
C-p criterion; influential data cases; multiple linear regression; variable selection
DOI
10.1016/j.csda.2005.02.003
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
This paper studies the influence of individual data cases when the C-p criterion is used for variable selection in multiple linear regression. Influence is assessed in terms of both the predictive power of the selected model and the predictor variables it includes. In particular, the focus is on the importance of identifying and dealing with so-called selection influential data cases before model selection and fitting are performed. A new selection influence measure based on the C-p criterion is developed to identify such cases, and its success in doing so is evaluated on two example data sets. © 2005 Elsevier B.V. All rights reserved.
Pages: 1840-1854
Page count: 15
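
The abstract does not define the authors' selection influence measure, so the sketch below is not their method. It is a naive leave-one-out check, given only to illustrate what a "selection influential" case might look like: it computes Mallows' C-p for every predictor subset using the standard formula RSS_p / s^2 - n + 2p (with s^2 estimated from the full model and p counting the intercept), then reports the cases whose deletion changes the C-p-minimizing subset. The function names `rss`, `best_cp_subset`, and `selection_changes` are hypothetical, and the data in the usage section are simulated.

```python
import itertools
import numpy as np

def rss(X, y, cols):
    """Residual sum of squares for OLS on an intercept plus the given columns."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def best_cp_subset(X, y):
    """Return (subset, C-p) for the predictor subset minimizing Mallows' C-p."""
    n, k = X.shape
    # Error variance estimated from the full model (assumes n > k + 1).
    sigma2 = rss(X, y, range(k)) / (n - k - 1)
    best = None
    for size in range(k + 1):
        for cols in itertools.combinations(range(k), size):
            # p = size + 1 parameters, counting the intercept.
            cp = rss(X, y, cols) / sigma2 - n + 2 * (size + 1)
            if best is None or cp < best[0]:
                best = (cp, cols)
    return best[1], best[0]

def selection_changes(X, y):
    """Naive illustration (assumption, not the paper's measure):
    list the cases whose deletion changes the selected subset."""
    full_subset, _ = best_cp_subset(X, y)
    changed = []
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        subset_i, _ = best_cp_subset(X[keep], y[keep])
        if subset_i != full_subset:
            changed.append((i, subset_i))
    return full_subset, changed

if __name__ == "__main__":
    # Simulated data: only the first predictor truly matters.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 3))
    y = 2.0 * X[:, 0] + 0.5 * rng.normal(size=30)
    subset, changed = selection_changes(X, y)
    print("subset selected on all cases:", subset)
    print("cases whose deletion changes the selection:", changed)
```

The all-subsets search is exponential in the number of predictors, so this sketch is only practical for a handful of variables.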