On robust regression with high-dimensional predictors

Cited by: 88
Authors
El Karoui, Noureddine [1]
Bean, Derek [1]
Bickel, Peter J. [1]
Lim, Chinghway [2]
Yu, Bin [1]
Affiliations
[1] Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720 USA
[2] Natl Univ Singapore, Fac Sci, Dept Stat & Appl Probabil, Singapore 119077, Singapore
Funding
National Science Foundation (USA)
Keywords
prox function; high-dimensional statistics; concentration of measure;
DOI
10.1073/pnas.1307842110
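Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract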
We study regression M-estimates in the setting where $p$, the number of covariates, and $n$, the number of observations, are both large, but $p \le n$. We find an exact stochastic representation for the distribution of $\hat{\beta} = \operatorname{argmin}_{\beta \in \mathbb{R}^p} \sum_{i=1}^{n} \rho(Y_i - X_i'\beta)$ at fixed $p$ and $n$ under various assumptions on the objective function $\rho$ and our statistical model. A scalar random variable whose deterministic limit $r_\rho(\kappa)$ can be studied when $p/n \to \kappa > 0$ plays a central role in this representation. We discover a nonlinear system of two deterministic equations that characterizes $r_\rho(\kappa)$. Interestingly, the system shows that $r_\rho(\kappa)$ depends on $\rho$ through proximal mappings of $\rho$ as well as various aspects of the statistical model underlying our study. Several surprising results emerge. In particular, we show that, when $p/n$ is large enough, least squares becomes preferable to least absolute deviations for double-exponential errors.
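The abstract refers to proximal mappings of the objective function without defining them. As a rough illustration only, the sketch below assumes the standard convex-analysis definition, prox_c(rho)(x) = argmin_y { c*rho(y) + (x - y)^2 / 2 }, and evaluates it for the two objectives the abstract compares: least absolute deviations (rho(y) = |y|) and least squares (rho(y) = y^2 / 2). The function names and the step parameter `c` are hypothetical and are not taken from the paper, and the paper's own normalization may differ.

```python
import math


def prox_abs(x, c):
    """Prox of rho(y) = |y| with step c > 0: soft thresholding at level c."""
    return math.copysign(max(abs(x) - c, 0.0), x)


def prox_square(x, c):
    """Prox of rho(y) = y**2 / 2 with step c > 0: linear shrinkage."""
    return x / (1.0 + c)


def prox_numeric(rho, x, c, half_width=10.0, steps=20001):
    """Brute-force grid search for argmin_y c*rho(y) + (x - y)**2 / 2.

    Used only to sanity-check the closed forms above; it is accurate to
    roughly the grid spacing (2 * half_width / (steps - 1)).
    """
    best_y, best_val = None, float("inf")
    for k in range(steps):
        y = -half_width + 2.0 * half_width * k / (steps - 1)
        val = c * rho(y) + 0.5 * (x - y) ** 2
        if val < best_val:
            best_y, best_val = y, val
    return best_y


if __name__ == "__main__":
    x, c = 3.0, 1.0
    print("LAD prox, closed form:", prox_abs(x, c))
    print("LAD prox, grid search:", prox_numeric(abs, x, c))
    print("LS prox, closed form: ", prox_square(x, c))
    print("LS prox, grid search: ", prox_numeric(lambda y: 0.5 * y * y, x, c))
```

The closed forms show the qualitative difference between the two objectives: the absolute-value prox sets small inputs exactly to zero, while the squared-loss prox shrinks every input by the same factor.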
Pages: 14557-14562
Page count: 6