Minimum Distance Lasso for robust high-dimensional regression

Cited by: 13
Authors
Lozano, Aurelie C. [1]
Meinshausen, Nicolai [2]
Yang, Eunho [1]
Affiliations
[1] IBM TJ Watson Res Ctr, 1101 Kitchawan Rd, Yorktown Hts, NY 10598 USA
[2] ETH, Seminar Stat, Raemistr 101, CH-8092 Zurich, Switzerland
Source
ELECTRONIC JOURNAL OF STATISTICS | 2016, Vol. 10, No. 1
Keywords
Lasso; robust estimation; high-dimensional variable selection; sparse learning; VARIABLE SELECTION; SHRINKAGE; RISK
DOI
10.1214/16-EJS1136
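Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract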
We propose a minimum distance estimation method for robust regression in sparse high-dimensional settings. Likelihood-based estimators lack resilience against outliers and model misspecification, a critical issue when dealing with high-dimensional noisy data. Our method, Minimum Distance Lasso (MD-Lasso), combines minimum distance functionals customarily used in nonparametric estimation for robustness, with ℓ1-regularization. MD-Lasso is governed by a scaling parameter capping the influence of outliers: the loss is locally convex and close to quadratic for small squared residuals, and flattens for squared residuals larger than the scaling parameter. As the parameter approaches infinity the estimator becomes equivalent to least-squares Lasso. MD-Lasso is able to maintain the robustness of minimum distance functionals in sparse high-dimensional regression. The estimator achieves maximum breakdown point and enjoys consistency with fast convergence rates under mild conditions on the model error distribution. These hold for any solution in a convexity region around the true parameter and in certain cases for every solution. We provide an alternative set of results that do not require the solutions to lie within the convexity region but where the ℓ2-norm of the feasible solutions is constrained within a safety radius. Thanks to this constraint, a first-order optimization method is able to produce local optima that are consistent. A connection is established with re-weighted least-squares that intuitively explains MD-Lasso robustness. The merits of our method are demonstrated through simulation and eQTL analysis.
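The behavior of the scaling parameter described in the abstract can be illustrated with a small numerical sketch. The abstract does not state the loss formula, so the form used below, -c * log(mean(exp(-r_i^2 / (2c)))) with scaling parameter c, is an assumption chosen because it has the stated properties (near-quadratic for small residuals, flat for large ones, least-squares in the limit of large c); the paper should be consulted for the exact definition. The function names `md_loss` and `md_weights` are hypothetical, and the ℓ1 penalty is omitted to keep the example short.

```python
import numpy as np

def md_loss(beta, X, y, c):
    """Smooth part of an ASSUMED minimum-distance-style loss:
    -c * log(mean(exp(-r_i^2 / (2c)))), with r = y - X @ beta.
    The exact loss in the paper may differ."""
    r = y - X @ beta
    z = -r**2 / (2.0 * c)
    m = z.max()
    # Numerically stable log-mean-exp.
    return -c * (m + np.log(np.mean(np.exp(z - m))))

def md_weights(beta, X, y, c):
    """Per-sample weights implied by the gradient of the assumed loss.
    The gradient equals -X.T @ (w * r), i.e. a weighted least-squares
    gradient, which is the re-weighting intuition: large residuals
    receive exponentially small weight."""
    r = y - X @ beta
    z = -r**2 / (2.0 * c)
    w = np.exp(z - z.max())
    return w / w.sum()

# Toy data: three small residuals and one gross outlier (illustrative only).
X = np.eye(4)
y = np.array([0.1, -0.2, 0.1, 10.0])
beta = np.zeros(4)

w = md_weights(beta, X, y, c=1.0)          # outlier weight is essentially zero
robust = md_loss(beta, X, y, c=1.0)        # outlier contributes almost nothing
near_ls = md_loss(beta, X, y, c=1e6)       # approaches mean(r^2) / 2
least_squares = np.mean((y - X @ beta) ** 2) / 2.0
```

With a small c the outlier is effectively ignored, while with a very large c the value approaches the ordinary least-squares objective, which matches the limiting behavior described in the abstract under the assumed loss form.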
Pages: 1296-1340
Page count: 45
Related papers
50 in total
  • [1] Robust adaptive LASSO in high-dimensional logistic regression
    Basu, Ayanendranath
    Ghosh, Abhik
    Jaenada, Maria
    Pardo, Leandro
    [J]. STATISTICAL METHODS AND APPLICATIONS, 2024,
  • [2] Localized Lasso for High-Dimensional Regression
    Yamada, Makoto
    Takeuchi, Koh
    Iwata, Tomoharu
    Shawe-Taylor, John
    Kaski, Samuel
    [J]. ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 54, 2017, 54 : 325 - 333
  • [3] High-dimensional robust inference for Cox regression models using desparsified Lasso
    Kong, Shengchun
    Yu, Zhuqing
    Zhang, Xianyang
    Cheng, Guang
    [J]. SCANDINAVIAN JOURNAL OF STATISTICS, 2021, 48 (03) : 1068 - 1095
  • [4] Influence Diagnostics for High-Dimensional Lasso Regression
    Rajaratnam, Bala
    Roberts, Steven
    Sparks, Doug
    Yu, Honglin
    [J]. JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2019, 28 (04) : 877 - 890
  • [5] ADAPTIVE LASSO FOR SPARSE HIGH-DIMENSIONAL REGRESSION MODELS
    Huang, Jian
    Ma, Shuangge
    Zhang, Cun-Hui
    [J]. STATISTICA SINICA, 2008, 18 (04) : 1603 - 1618
  • [6] LASSO Isotone for High-Dimensional Additive Isotonic Regression
    Fang, Zhou
    Meinshausen, Nicolai
    [J]. JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2012, 21 (01) : 72 - 91
  • [7] Spline-Lasso in High-Dimensional Linear Regression
    Guo, Jianhua
    Hu, Jianchang
    Jing, Bing-Yi
    Zhang, Zhen
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2016, 111 (513) : 288 - 297
  • [8] ASYMPTOTIC ANALYSIS OF HIGH-DIMENSIONAL LAD REGRESSION WITH LASSO
    Gao, Xiaoli
    Huang, Jian
    [J]. STATISTICA SINICA, 2010, 20 (04) : 1485 - 1506
  • [9] On robust regression with high-dimensional predictors
    El Karoui, Noureddine
    Bean, Derek
    Bickel, Peter J.
    Lim, Chinghway
    Yu, Bin
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2013, 110 (36) : 14557 - 14562
  • [10] The joint lasso: high-dimensional regression for group structured data
    Dondelinger, Frank
    Mukherjee, Sach
    [J]. BIOSTATISTICS, 2020, 21 (02) : 219 - 235