Multiple testing and variable selection along the path of the least angle regression

Authors
Azais, Jean-Marc [1]
De Castro, Yohann [2]
Affiliations
[1] Toulouse Univ Paul Sabatier, Inst Math, 118 Route Narbonne, F-31062 Toulouse, France
[2] Ecole Cent Lyon, UMR 5208, Inst Camille Jordan, 36 Ave Guy Collongue, F-69134 Ecully, France
Keywords
multiple testing; false discovery rate; high-dimension; selective inference; LASSO; inference; SLOPE
DOI
10.1093/imaiai/iaac018
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
We investigate multiple testing and variable selection using the Least Angle Regression (LARS) algorithm in high dimensions under the assumption of Gaussian noise. LARS is known to produce a piecewise affine solution path whose change points are referred to as the knots of the LARS path. The key to our results is a closed-form expression of the exact joint law of a K-tuple of knots conditional on the variables selected by LARS, the so-called post-selection joint law of the LARS knots. Numerical experiments demonstrate the perfect fit of our findings. This paper makes three main contributions. First, we build testing procedures on variables entering the model along the LARS path in the general design case when the noise level can be unknown. These testing procedures are referred to as the Generalized t-Spacing tests, and we prove that they have an exact non-asymptotic level (i.e. the Type I error is exactly controlled). This extends the work of [31], where the spacing test works for consecutive knots and known variance. Second, we introduce a new exact multiple testing procedure after model selection in the general design case when the noise level may be unknown. We prove that this testing procedure has an exact non-asymptotic level for general design and unknown noise level. Third, we prove exact control of the false discovery rate under the orthogonal design assumption. Monte Carlo simulations and a real data experiment are provided to illustrate our results in this case. Of independent interest, we introduce an equivalent formulation of the LARS algorithm based on a recursive function.
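As a rough illustration of the orthogonal design setting mentioned in the abstract, the sketch below relies on one well-known simplification: when the design has orthonormal columns, the LARS knots are simply the absolute values of the entries of X^T y sorted in decreasing order. It then applies the classic Benjamini-Hochberg step-up rule with a known noise level, used here only as a stand-in. This is not the Generalized t-Spacing test or the multiple testing procedure of the paper. The function names, the vector `z` and the values of `sigma` and `alpha` are all hypothetical.

```python
import math

def orthogonal_lars_knots(z):
    """Knots of the LARS path when the design has orthonormal columns.

    z is the vector X^T y. In this special case the k-th knot is the
    k-th largest |z_j|, and variables enter the model in that order.
    """
    order = sorted(range(len(z)), key=lambda j: -abs(z[j]))
    knots = [abs(z[j]) for j in order]
    return order, knots

def two_sided_p_value(value, sigma):
    """P(|N(0, sigma^2)| >= |value|) via the complementary error function."""
    return math.erfc(abs(value) / (sigma * math.sqrt(2.0)))

def benjamini_hochberg(p_values, alpha):
    """Classic BH step-up rule (a stand-in, not the paper's procedure).

    Returns the sorted indices of the rejected hypotheses.
    """
    m = len(p_values)
    ranked = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(ranked, start=1):
        if p_values[i] <= alpha * rank / m:
            cutoff = rank  # keep the largest rank that passes
    return sorted(ranked[:cutoff])

# Hypothetical values of X^T y for six variables, noise level assumed known.
z = [4.1, -0.3, 0.8, -3.5, 0.1, 1.2]
order, knots = orthogonal_lars_knots(z)
p_values = [two_sided_p_value(v, sigma=1.0) for v in z]
selected = benjamini_hochberg(p_values, alpha=0.1)

print("entry order:", order)
print("knots:", knots)
print("selected variables:", selected)
```

With these made-up inputs, variables 0 and 3 enter first along the path and are the only ones retained by the step-up rule. The unknown-variance and general design cases treated in the paper need the post-selection joint law of the knots, which this sketch does not attempt to reproduce.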
Pages: 1329-1388
Page count: 60