Fast robust variable selection using VIF regression in large datasets

Cited by: 2
Author
Seo, Han Son [1]
Affiliation
[1] Konkuk Univ, Dept Appl Stat, 120 Neungdong Ro, Seoul 05029, South Korea
Keywords
large dataset; linear regression; stagewise regression; variable selection
DOI
10.5351/KJAS.2018.31.4.463
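Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract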
This article considers variable selection algorithms for linear regression models on large datasets. Many algorithms have been proposed that focus on speed and robustness. Among them, variance inflation factor (VIF) regression is fast and accurate because it uses a streamwise regression approach. However, VIF regression is susceptible to outliers because it estimates the model by least squares. To address this, a robust criterion based on a weighted estimator has been proposed, as has a robust VIF regression. This article suggests a fast and robust variable selection method that applies VIF regression after detecting and removing potential outliers. A simulation study and an analysis of a dataset compare the suggested method with other methods.
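
The two-step idea in the abstract (screen out potential outliers, then run a streamwise selection on the remaining rows) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure from the article: the function names `flag_outliers` and `streamwise_select`, the MAD-based residual screen, the cutoff of 3, and the fixed t-statistic threshold of 4 are all hypothetical choices. The sketch also computes the exact partial t-statistic for each candidate, whereas actual VIF regression approximates it with a subsample-based variance inflation correction for speed.

```python
import numpy as np

def flag_outliers(X, y, cutoff=3.0):
    """Return a boolean mask of rows to keep (assumed MAD-based residual screen)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    med = np.median(r)
    # 1.4826 makes the MAD consistent with the standard deviation under normality
    mad = 1.4826 * np.median(np.abs(r - med))
    return np.abs(r - med) <= cutoff * mad

def streamwise_select(X, y, t_threshold=4.0):
    """One pass over the candidate columns; keep those with a large partial t-statistic."""
    n, p = X.shape
    selected = []
    basis = np.ones((n, 1))      # current model: intercept only
    resid = y - y.mean()         # residual of y on the current model
    for j in range(p):
        x = X[:, j]
        # Remove from the candidate the part already explained by the model
        coef, *_ = np.linalg.lstsq(basis, x, rcond=None)
        x_res = x - basis @ coef
        ss = x_res @ x_res
        if ss < 1e-12:           # candidate is (numerically) collinear with the model
            continue
        b = (x_res @ resid) / ss
        new_resid = resid - b * x_res
        df = n - basis.shape[1] - 1
        sigma2 = (new_resid @ new_resid) / df
        t = b / np.sqrt(sigma2 / ss)
        if abs(t) > t_threshold:
            selected.append(j)
            basis = np.column_stack([basis, x])
            resid = new_resid
    return selected

if __name__ == "__main__":
    # Synthetic data (assumed): 3 active predictors out of 10, 20 contaminated responses
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 10))
    y = 2 * X[:, 0] + 2 * X[:, 3] + 2 * X[:, 7] + rng.normal(size=500)
    y[:20] += 50.0
    keep = flag_outliers(X, y)
    print("rows kept:", int(keep.sum()))
    print("selected columns:", streamwise_select(X[keep], y[keep]))
```

A single least-squares fit is used for the screen to keep the sketch short; because that fit is itself affected by outliers, a robust initial estimator would be a more careful choice in practice.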
Pages: 463-473
Page count: 11