Fast robust variable selection using VIF regression in large datasets

Cited by: 2
Authors
Seo, Han Son [1]
Affiliations
[1] Konkuk Univ, Dept Appl Stat, 120 Neungdong Ro, Seoul 05029, South Korea
Keywords
large dataset; linear regression; stagewise regression; variable selection
DOI
10.5351/KJAS.2018.31.4.463
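Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract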
Variable selection algorithms for linear regression models on large datasets are considered. Many algorithms have been proposed with a focus on speed and robustness. Among them, variance inflation factor (VIF) regression is fast and accurate because it uses a streamwise regression approach. However, VIF regression is susceptible to outliers because it estimates the model by least squares. To address this, a robust criterion using a weighted estimator has been proposed, as has a robust VIF regression. This article suggests a fast and robust variable selection method that applies VIF regression after detecting and removing potential outliers. A simulation study and an analysis of a dataset are conducted to compare the suggested method with other methods.
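For readers unfamiliar with the term, the following is a minimal sketch of the classical variance inflation factor, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors. It is not the VIF regression procedure or the outlier-removal method of the article; it only illustrates the quantity the name refers to. The function names (`correlation_matrix`, `invert`, `vif`) and the simulated data are assumptions made for illustration.

```python
import random

def correlation_matrix(cols):
    """Pearson correlation matrix of a list of equal-length columns."""
    n = len(cols[0])
    scaled = []
    for c in cols:
        mean = sum(c) / n
        dev = [v - mean for v in c]
        norm = sum(v * v for v in dev) ** 0.5
        scaled.append([v / norm for v in dev])  # centered, unit-length column
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in scaled] for ci in scaled]

def invert(mat):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices only)."""
    p = len(mat)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(p)]
           for i, row in enumerate(mat)]
    for col in range(p):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(p):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[p:] for row in aug]

def vif(cols):
    """Classical VIFs: the diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(cols))
    return [inv[j][j] for j in range(len(cols))]

# Simulated (hypothetical) data: x3 is nearly collinear with x1, x2 is independent.
random.seed(0)
n = 500
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [a + 0.1 * random.gauss(0, 1) for a in x1]
vifs = vif([x1, x2, x3])
print([round(v, 2) for v in vifs])
```

In this sketch the collinear pair x1 and x3 receives large VIFs, while the independent predictor x2 stays close to 1. Because the computation rests on least-squares quantities, a few extreme observations can distort it, which is the sensitivity the abstract refers to.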
Pages: 463-473
Number of pages: 11