CANONICAL THRESHOLDING FOR NONSPARSE HIGH-DIMENSIONAL LINEAR REGRESSION

Cited by: 0
Authors
Silin, Igor [1 ]
Fan, Jianqing [1 ]
Affiliations
[1] Princeton Univ, Dept Operat Res & Financial Engn, Princeton, NJ 08544 USA
Source
ANNALS OF STATISTICS | 2022, Vol. 50, No. 01
Keywords
High-dimensional linear regression; covariance eigenvalues decay; thresholding; relative errors; principal component regression; VARIABLE SELECTION; WAVELET; PREDICTION; SHRINKAGE; LASSO; SLOPE;
DOI
10.1214/21-AOS2116
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject Classification Codes
020208; 070103; 0714
Abstract
We consider a high-dimensional linear regression problem. Unlike many papers on the topic, we do not require sparsity of the regression coefficients; instead, our main structural assumption is a decay of the eigenvalues of the covariance matrix of the data. We propose a new family of estimators, called canonical thresholding estimators, which select the largest regression coefficients in the canonical form. The estimators admit an explicit form and can be linked to LASSO and Principal Component Regression (PCR). A theoretical analysis is provided for both fixed-design and random-design settings. The obtained bounds on the mean squared error and the prediction error of a specific estimator from the family allow us to state clearly sufficient conditions on the decay of eigenvalues that ensure convergence. In addition, we promote the use of relative errors, which are strongly linked with the out-of-sample R-2. The study of these relative errors leads to a new concept of joint effective dimension, which incorporates the covariance of the data and the regression coefficients simultaneously and describes the complexity of a linear regression problem. Some minimax lower bounds are established to showcase the optimality of our procedure. Numerical simulations confirm the good performance of the proposed estimators compared to previously developed methods.
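The abstract's core idea, selecting the largest regression coefficients "in the canonical form", can be sketched as follows. This is an illustrative reconstruction under assumptions, not the authors' exact procedure: the function name, the uncentered sample covariance, and the specific thresholding rule (keeping components whose signal strength lambda_j * theta_j^2 exceeds a threshold tau) are all choices made here for demonstration.

```python
import numpy as np

def canonical_thresholding(X, y, tau):
    """Schematic canonical thresholding estimator (illustrative sketch only).

    Rotates the problem into the eigenbasis of the sample covariance,
    keeps only canonical coefficients whose scaled magnitude exceeds the
    threshold tau, and maps the result back to the original coordinates.
    The paper's exact thresholding rule may differ.
    """
    n, p = X.shape
    Sigma_hat = X.T @ X / n                        # uncentered sample covariance
    eigvals, U = np.linalg.eigh(Sigma_hat)         # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]              # reorder to descending
    eigvals, U = eigvals[order], U[:, order]
    scores = U.T @ (X.T @ y) / n                   # projections of X^T y / n
    with np.errstate(divide="ignore", invalid="ignore"):
        # canonical coefficients: per-component least squares, guarding
        # against (numerically) zero eigenvalues
        canon = np.where(eigvals > 1e-12, scores / eigvals, 0.0)
    keep = eigvals * canon**2 > tau                # assumed signal-strength rule
    return U @ (canon * keep)                      # map back to original basis
```

With tau = 0 and a full-rank design (n > p), every component is kept and the estimator reduces to ordinary least squares, while a large tau drives the estimate toward zero; intermediate values interpolate between the two, which is the sense in which the family relates to PCR (dropping weak components) and thresholding methods.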
Pages: 460-486 (27 pages)