On robust regression with high-dimensional predictors

Citations: 88
Authors
El Karoui, Noureddine [1 ]
Bean, Derek [1 ]
Bickel, Peter J. [1 ]
Lim, Chinghway [2 ]
Yu, Bin [1 ]
Affiliations
[1] Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720 USA
[2] Natl Univ Singapore, Fac Sci, Dept Stat & Appl Probabil, Singapore 119077, Singapore
Funding
National Science Foundation (USA)
Keywords
prox function; high-dimensional statistics; concentration of measure;
DOI
10.1073/pnas.1307842110
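Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes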
07 ; 0710 ; 09 ;
Abstract
We study regression M-estimates in the setting where $p$, the number of covariates, and $n$, the number of observations, are both large, but $p \le n$. We find an exact stochastic representation for the distribution of $\hat{\beta} = \operatorname{argmin}_{\beta \in \mathbb{R}^p} \sum_{i=1}^{n} \rho(Y_i - X_i'\beta)$ at fixed $p$ and $n$ under various assumptions on the objective function $\rho$ and our statistical model. A scalar random variable whose deterministic limit $r_\rho(\kappa)$ can be studied when $p/n \to \kappa > 0$ plays a central role in this representation. We discover a nonlinear system of two deterministic equations that characterizes $r_\rho(\kappa)$. Interestingly, the system shows that $r_\rho(\kappa)$ depends on $\rho$ through proximal mappings of $\rho$ as well as various aspects of the statistical model underlying our study. Several surprising results emerge. In particular, we show that, when $p/n$ is large enough, least squares becomes preferable to least absolute deviations for double-exponential errors.
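The abstract refers to proximal mappings of the objective function without spelling them out. As a minimal sketch, assuming the standard definition prox_c(rho)(x) = argmin_y { c*rho(y) + (x - y)^2 / 2 } (the abstract itself does not state a convention), the two objectives it compares have simple closed-form prox functions: shrinkage toward zero for least squares and soft-thresholding for least absolute deviations. The function names and the grid-based check below are illustrative and are not taken from the paper.

```python
def prox_squared(x, c):
    """Prox of c * rho for rho(y) = y**2 / 2 (least squares): linear shrinkage."""
    return x / (1.0 + c)


def prox_abs(x, c):
    """Prox of c * rho for rho(y) = |y| (least absolute deviations): soft-thresholding."""
    if x > c:
        return x - c
    if x < -c:
        return x + c
    return 0.0


def prox_numeric(rho, x, c, lo=-10.0, hi=10.0, steps=200001):
    """Brute-force check: minimize c * rho(y) + (x - y)**2 / 2 over a uniform grid.

    The grid bounds and resolution are arbitrary choices for illustration.
    """
    step = (hi - lo) / (steps - 1)
    best_y, best_val = lo, float("inf")
    for k in range(steps):
        y = lo + k * step
        val = c * rho(y) + 0.5 * (x - y) ** 2
        if val < best_val:
            best_y, best_val = y, val
    return best_y


if __name__ == "__main__":
    x, c = 3.0, 0.5
    print(prox_squared(x, c), prox_numeric(lambda y: y * y / 2.0, x, c))
    print(prox_abs(x, c), prox_numeric(abs, x, c))
```

The closed forms and the grid search should agree up to the grid spacing, which gives a quick sanity check on the definition assumed here.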
Pages: 14557 - 14562
Page count: 6
Related papers
50 records in total
  • [21] Robust and sparse estimation methods for high-dimensional linear and logistic regression
    Kurnaz, Fatma Sevinc
    Hoffmann, Irene
    Filzmoser, Peter
    [J]. CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2018, 172 : 211 - 222
  • [22] Heterogeneous robust estimation with the mixed penalty in high-dimensional regression model
    Zhu, Yanling
    Wang, Kai
    [J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2024, 53 (08) : 2730 - 2743
  • [23] Regression on High-dimensional Inputs
    Kuleshov, Alexander
    Bernstein, Alexander
    [J]. 2016 IEEE 16TH INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW), 2016 : 732 - 739
  • [24] On inference in high-dimensional regression
    Battey, Heather S.
    Reid, Nancy
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2023, 85 (01) : 149 - 175
  • [25] Converting high-dimensional regression to high-dimensional conditional density estimation
    Izbicki, Rafael
    Lee, Ann B.
    [J]. ELECTRONIC JOURNAL OF STATISTICS, 2017, 11 (02) : 2800 - 2831
  • [26] Robust high-dimensional screening
    Kim, Aleksandra
    Mutel, Christopher
    Froemelt, Andreas
    [J]. ENVIRONMENTAL MODELLING & SOFTWARE, 2022, 148
  • [28] Sparse semiparametric regression when predictors are mixture of functional and high-dimensional variables
    Novo, Silvia
    Aneiros, German
    Vieu, Philippe
    [J]. TEST, 2021, 30 (02) : 481 - 504
  • [29] Robust Coordinate Descent Algorithm Robust Solution Path for High-dimensional Sparse Regression Modeling
    Park, H.
    Konishi, S.
    [J]. COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2016, 45 (01) : 115 - 129
  • [30] TransFusion: Covariate-Shift Robust Transfer Learning for High-Dimensional Regression
    He, Zelin
    Sun, Ying
    Liu, Jingyuan
    Li, Runze
    [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238