Multiple testing and variable selection along the path of the least angle regression

Cited by: 0
Authors
Azais, Jean-Marc [1]
De Castro, Yohann [2]
Affiliations
[1] Toulouse Univ Paul Sabatier, Inst Math, 118 Route Narbonne, F-31062 Toulouse, France
[2] Ecole Cent Lyon, UMR 5208, Inst Camille Jordan, 36 Ave Guy Collongue, F-69134 Ecully, France
Keywords
multiple testing; false discovery rate; high-dimension; selective inference; LASSO; inference; SLOPE
DOI
10.1093/imaiai/iaac018
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
We investigate multiple testing and variable selection using the Least Angle Regression (LARS) algorithm in high dimensions under the assumption of Gaussian noise. LARS is known to produce a piecewise affine solution path whose change points are referred to as the knots of the LARS path. The key to our results is a closed-form expression for the exact joint law of a K-tuple of knots conditional on the variables selected by LARS, the so-called post-selection joint law of the LARS knots. Numerical experiments show a perfect fit with our theoretical findings. This paper makes three main contributions. First, we build testing procedures on variables entering the model along the LARS path in the general design case when the noise level can be unknown. These testing procedures are referred to as the Generalized t-Spacing tests, and we prove that they have an exact non-asymptotic level (i.e. the Type I error is exactly controlled). This extends the work of [31], where the spacing test applies to consecutive knots and known variance. Second, we introduce a new exact multiple testing procedure after model selection in the general design case when the noise level may be unknown. We prove that this testing procedure has exact non-asymptotic level for general design and unknown noise level. Third, we prove exact control of the false discovery rate under an orthogonal design assumption. Monte Carlo simulations and a real data experiment are provided to illustrate our results in this case. Of independent interest, we introduce an equivalent formulation of the LARS algorithm based on a recursive function.
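To make the notions of "knots" and "piecewise affine path" concrete, the following is a minimal sketch restricted to the orthonormal design case. That restriction is an assumption of this sketch, not the general setting of the paper: when the columns of the design are orthonormal, LARS coincides with the lasso and reduces to soft thresholding of the correlations, so the knots are simply the ordered absolute correlations. The function names and the input values are hypothetical and do not come from the paper.

```python
def lars_knots_orthonormal(z):
    """Knots of the LARS path for a design with orthonormal columns.

    z is the list of correlations X^T y. Assumption of this sketch:
    X^T X = I, so the path is given by soft thresholding.
    Returns the knots in decreasing order and the indices of the
    variables in the order in which they enter the model.
    """
    order = sorted(range(len(z)), key=lambda j: abs(z[j]), reverse=True)
    knots = [abs(z[j]) for j in order]
    return knots, order


def coefficients_at(z, lam):
    """Soft-thresholded coefficients at level lam (orthonormal case only)."""
    return [(1 if v > 0 else -1) * max(abs(v) - lam, 0.0) for v in z]


z = [0.5, -3.0, 2.0, 0.1]  # hypothetical correlations X^T y
knots, order = lars_knots_orthonormal(z)
print(knots)                    # [3.0, 2.0, 0.5, 0.1]
print(order)                    # [1, 2, 0, 3]
print(coefficients_at(z, 1.0))  # [0.0, -2.0, 1.0, 0.0]
```

Between two consecutive knots each coefficient is an affine function of the level `lam`, which is the piecewise affine behaviour the abstract refers to. The spacing tests of the paper are built on the joint law of these knots and are not reproduced here.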
Pages: 1329-1388
Page count: 60
Related papers
50 in total
  • [1] Variable selection in partial linear regression using the least angle regression
    Seo, Han Son
    Yoon, Min
    Lee, Hakbae
    [J]. KOREAN JOURNAL OF APPLIED STATISTICS, 2021, 34 (06) : 937 - 944
  • [2] Outlier Detection and Robust Variable Selection for Least Angle Regression
    Shahriari, Shirin
    Faria, Susana
    Manuela Goncalves, A.
    Van Aelst, Stefan
    [J]. COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2014, PT III, 2014, 8581 : 512 - +
  • [3] Least angle regression for model selection
    Zhang, Hongyang
    Zamar, Ruben H.
    [J]. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS, 2014, 6 (02) : 116 - 123
  • [4] Robust variable selection using least angle regression and elemental set sampling
    McCann, Lauren
    Welsch, Roy E.
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2007, 52 (01) : 249 - 257
  • [5] Consistent variable selection in high dimensional regression via multiple testing
    Bunea, Florentina
    Wegkamp, Marten H.
    Auguste, Anna
    [J]. JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2006, 136 (12) : 4349 - 4364
  • [6] Variable Selection Method of NIR Spectroscopy Based on Least Angle Regression and GA-PLS
    Yan Sheng-ke
    Yang Hui-hua
    Hu Bai-chao
    Ren Chao-chao
    Liu Zhen-bing
    [J]. SPECTROSCOPY AND SPECTRAL ANALYSIS, 2017, 37 (06) : 1733 - 1738
  • [7] Inverse fuzzy fault model for fault detection and isolation with least angle regression for variable selection
    Marquez-Vera, M. A.
    Ramos-Velasco, L. E.
    Lopez-Ortega, O.
    Zuniga-Pena, N. S.
    Ramos-Fernandez, J. C.
    Ortega-Mendoza, R. M.
    [J]. COMPUTERS & INDUSTRIAL ENGINEERING, 2021, 159
  • [8] Variable selection in near infrared spectroscopy based on significance testing in partial least squares regression
    Westad, F
    Martens, H
    [J]. JOURNAL OF NEAR INFRARED SPECTROSCOPY, 2000, 8 (02) : 117 - 124
  • [9] Variable selection in multivariate multiple regression
    Variyath, Asokan Mulayath
    Brobbey, Anita
    [J]. PLOS ONE, 2020, 15 (07)
  • [10] Robust linear model selection based on least angle regression
    Khan, Jafar A.
    Van Aelst, Stefan
    Zamar, Ruben H.
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2007, 102 (480) : 1289 - 1299