FULLY EFFICIENT ROBUST ESTIMATION, OUTLIER DETECTION AND VARIABLE SELECTION VIA PENALIZED REGRESSION

Cited by: 18
Authors
Kong, Dehan [1 ]
Bondell, Howard D. [2 ]
Wu, Yichao [2 ]
Affiliations
[1] Univ Toronto, Dept Stat Sci, Toronto, ON M5S 3G3, Canada
[2] North Carolina State Univ, Dept Stat, Raleigh, NC 27695 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA)
Keywords
Adaptive; breakdown point; least trimmed squares; outliers; penalized regression; robust regression; variable selection; LEAST ANGLE REGRESSION; SQUARES REGRESSION; ORACLE PROPERTIES; MODEL SELECTION; HIGH BREAKDOWN; LASSO; LIKELIHOOD; SHRINKAGE;
DOI
10.5705/ss.202016.0441
Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
This paper studies the outlier detection and variable selection problem in linear regression. A mean shift parameter is added to the linear model to reflect the effect of outliers, where an outlier has a nonzero shift parameter. We then apply an adaptive regularization to these shift parameters to shrink most of them to zero. Those observations with nonzero mean shift parameter estimates are regarded as outliers. An L1 penalty is added to the regression parameters to select important predictors. We propose an efficient algorithm to solve this jointly penalized optimization problem and use the extended Bayesian information criteria tuning method to select the regularization parameters, since the number of parameters exceeds the sample size. Theoretical results are provided in terms of high breakdown point, full efficiency, as well as outlier detection consistency. We illustrate our method with simulations and data. Our method is extended to high-dimensional problems with dimension much larger than the sample size.
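The jointly penalized problem described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details are not given in this record. It assumes the objective 0.5 * ||y - X beta - gamma||^2 + lam_gamma * sum_i w_i |gamma_i| + lam_beta * sum_j |beta_j| and solves it by plain block coordinate descent. The function names, the tuning values, and the unit default for the adaptive weights w_i are all hypothetical. In an adaptive scheme the weights would typically come from an initial robust fit, which is not shown here.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding: sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def mean_shift_fit(X, y, lam_gamma, lam_beta, weights=None, n_iter=200):
    """Illustrative solver (an assumption, not the paper's method) for
    0.5*||y - X b - g||^2 + lam_gamma*sum(w*|g|) + lam_beta*sum(|b|)."""
    n, p = X.shape
    beta = np.zeros(p)
    gamma = np.zeros(n)
    if weights is None:
        weights = np.ones(n)  # hypothetical default: no adaptivity
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        # Shift step: each gamma_i is the soft-thresholded residual.
        gamma = soft_threshold(y - X @ beta, lam_gamma * weights)
        # Coefficient step: one coordinate-descent pass of the lasso
        # on the shift-corrected response y - gamma.
        r = y - gamma - X @ beta
        for j in range(p):
            r += X[:, j] * beta[j]
            beta[j] = soft_threshold(X[:, j] @ r, lam_beta) / col_sq[j]
            r -= X[:, j] * beta[j]
    return beta, gamma

# Synthetic data (made up for illustration): 100 observations,
# 5 predictors, two active coefficients, first 5 responses shifted.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
beta_true = np.array([3.0, 0.0, 0.0, -2.0, 0.0])
y = X @ beta_true + 0.3 * rng.standard_normal(100)
y[:5] += 15.0

beta_hat, gamma_hat = mean_shift_fit(X, y, lam_gamma=3.0, lam_beta=30.0)
flagged = np.flatnonzero(gamma_hat)  # observations treated as outliers
print("coefficients:", np.round(beta_hat, 2))
print("flagged observations:", flagged)
```

Observations with a nonzero entry in `gamma_hat` are the ones flagged as outliers, mirroring the rule stated in the abstract. With unit weights the shift step acts like a Huber-type fit, so this sketch does not carry the breakdown or efficiency guarantees the paper claims for its adaptive procedure.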
Pages: 1031-1052
Number of pages: 22
Related papers
50 in total
  • [21] Robust Variable Selection and Estimation in Threshold Regression Model
    Bo-wen LI
    Yun-qi ZHANG
    Nian-sheng TANG
    Acta Mathematicae Applicatae Sinica, 2020, 36 (02) : 332 - 346
  • [22] Robust Variable Selection and Estimation in Threshold Regression Model
    Li, Bo-wen
    Zhang, Yun-qi
    Tang, Nian-sheng
    ACTA MATHEMATICAE APPLICATAE SINICA-ENGLISH SERIES, 2020, 36 (02): : 332 - 346
  • [23] Robust estimation and variable selection in heteroscedastic linear regression
    Gijbels, I.
    Vrinssen, I.
    STATISTICS, 2019, 53 (03) : 489 - 532
  • [24] A robust and efficient variable selection method for linear regression
    Yang, Zhuoran
    Fu, Liya
    Wang, You-Gan
    Dong, Zhixiong
    Jiang, Yunlu
    JOURNAL OF APPLIED STATISTICS, 2022, 49 (14) : 3677 - 3692
  • [25] Erratum to: On Estimation and Selection of Autologistic Regression Models via Penalized Pseudolikelihood
    Rao Fu
    Andrew L. Thurman
    Tingjin Chu
    Michelle M. Steen-adams
    Jun Zhu
    Journal of Agricultural, Biological and Environmental Statistics, 2017, 22 : 413 - 419
  • [26] Genetic algorithms for outlier detection and variable selection in linear regression models
    Tolvi, J
    SOFT COMPUTING, 2004, 8 (08) : 527 - 533
  • [27] Genetic algorithms for outlier detection and variable selection in linear regression models
    J. Tolvi
    Soft Computing, 2004, 8 : 527 - 533
  • [28] Penalized variable selection in competing risks regression
    Fu, Zhixuan
    Parikh, Chirag R.
    Zhou, Bingqing
    LIFETIME DATA ANALYSIS, 2017, 23 (03) : 353 - 376
  • [29] Penalized variable selection in competing risks regression
    Zhixuan Fu
    Chirag R. Parikh
    Bingqing Zhou
    Lifetime Data Analysis, 2017, 23 : 353 - 376
  • [30] A fuzzy penalized regression model with variable selection
    Kashani, M.
    Arashi, M.
    Rabiei, M. R.
    D'Urso, P.
    De Giovanni, L.
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 175