Variable selection in multivariate linear models with high-dimensional covariance matrix estimation

Cited by: 8
Authors
Perrot-Dockes, Marie [1]
Levy-Leduc, Celine [1]
Sansonnet, Laure [1]
Chiquet, Julien [1]
Affiliations
[1] Univ Paris Saclay, UMR MIA Paris, AgroParisTech, INRA, F-75005 Paris, France
Keywords
High-dimensional covariance matrix estimation; Lasso; Multivariate linear model; Variable selection; Maximum likelihood; Regression
DOI
10.1016/j.jmva.2018.02.006
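Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract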
In this paper, we propose a novel variable selection approach in the framework of multivariate linear models that takes into account the dependence that may exist between the responses. It consists in first estimating the covariance matrix Σ of the responses and then plugging this estimator into a Lasso criterion, in order to obtain a sparse estimator of the coefficient matrix. The properties of our approach are investigated from both a theoretical and a numerical point of view. More precisely, we give general conditions that the estimators of the covariance matrix and its inverse have to satisfy in order to recover the positions of the null and non-null entries of the coefficient matrix when the size of Σ is not fixed and can tend to infinity. We prove that these conditions are satisfied in the particular case of some Toeplitz matrices. Our approach is implemented in the R package MultiVarSel, available from the Comprehensive R Archive Network (CRAN), and is very attractive since it benefits from a low computational load. We also assess the performance of our methodology using synthetic data and compare it with alternative approaches. Our numerical experiments show that including the estimation of the covariance matrix in the Lasso criterion dramatically improves the variable selection performance in many cases. © 2018 Elsevier Inc. All rights reserved.
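The two-step idea described in the abstract (estimate the covariance of the responses, then plug the estimate into a Lasso criterion) can be sketched as follows. This is a minimal illustration under stated assumptions, not the MultiVarSel implementation: the model is taken to be Y = XB + E with rows of E having covariance Σ, the estimator of Σ is the plain empirical covariance of ordinary least squares residuals (the paper considers structured estimators, for example for Toeplitz matrices), and the Lasso is solved with a basic proximal gradient (ISTA) loop. The function names `soft_threshold`, `lasso_ista` and `whitened_lasso` are hypothetical.

```python
import numpy as np


def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1, applied elementwise."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_ista(A, y, lam, n_iter=2000):
    """Minimize 0.5 * ||y - A b||^2 + lam * ||b||_1 by proximal gradient."""
    # Step size 1 / L, where L = ||A||_2^2 is the gradient's Lipschitz constant.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    b = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ b - y)
        b = soft_threshold(b - step * grad, step * lam)
    return b


def whitened_lasso(X, Y, lam):
    """Sparse estimate of B in Y = X B + E after whitening the responses.

    Assumption (illustrative only): Sigma is estimated by the empirical
    covariance of OLS residuals, which requires n > p and n - p >= q.
    """
    n, q = Y.shape
    p = X.shape[1]

    # Step 1: estimate the covariance of the responses from OLS residuals.
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    residuals = Y - X @ B_ols
    sigma_hat = residuals.T @ residuals / (n - p)

    # Step 2: symmetric inverse square root of the estimated covariance.
    eigval, eigvec = np.linalg.eigh(sigma_hat)
    s_inv_half = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T

    # Step 3: whiten and vectorize (column-major), using
    # vec(X B S) = (S^T kron X) vec(B).
    y_vec = (Y @ s_inv_half).reshape(-1, order="F")
    A = np.kron(s_inv_half.T, X)

    # Step 4: Lasso on the whitened, vectorized model.
    b_vec = lasso_ista(A, y_vec, lam)
    return b_vec.reshape(p, q, order="F")
```

Calling `whitened_lasso(X, Y, lam)` with an n-by-p design matrix `X` and an n-by-q response matrix `Y` returns a p-by-q coefficient estimate whose zero pattern is the selected support. The penalty `lam` is left to the caller here; in practice it would be chosen by a data-driven procedure such as cross-validation or stability selection.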
Pages: 78-97
Number of pages: 20
Related papers
50 in total
  • [1] Bandwidth Selection for High-Dimensional Covariance Matrix Estimation
    Qiu, Yumou
    Chen, Song Xi
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2015, 110 (511) : 1160 - 1174
  • [2] Variable selection and estimation in high-dimensional models
    Horowitz, Joel L.
    [J]. CANADIAN JOURNAL OF ECONOMICS-REVUE CANADIENNE D'ECONOMIQUE, 2015, 48 (02) : 389 - 407
  • [3] BayesSUR: An R Package for High-Dimensional Multivariate Bayesian Variable and Covariance Selection in Linear Regression
    Zhao, Zhi
    Banterle, Marco
    Bottolo, Leonardo
    Richardson, Sylvia
    Lewin, Alex
    Zucknick, Manuela
    [J]. JOURNAL OF STATISTICAL SOFTWARE, 2021, 100 (11) : 1 - 32
  • [4] Estimation in High-Dimensional Analysis and Multivariate Linear Models
    Kollo, Tonu
    Von Rosen, Tatjana
    Von Rosen, Dietrich
    [J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2011, 40 (07) : 1241 - 1253
  • [5] High-dimensional covariance matrix estimation
    Lam, Clifford
    [J]. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS, 2020, 12 (02)
  • [6] HIGH-DIMENSIONAL COVARIANCE MATRIX ESTIMATION IN APPROXIMATE FACTOR MODELS
    Fan, Jianqing
    Liao, Yuan
    Mincheva, Martina
    [J]. ANNALS OF STATISTICS, 2011, 39 (06) : 3320 - 3356
  • [7] Estimation and optimal structure selection of high-dimensional Toeplitz covariance matrix
    Yang, Yihe
    Zhou, Jie
    Pan, Jianxin
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2021, 184
  • [8] High-Dimensional Consistencies of KOO Methods for the Selection of Variables in Multivariate Linear Regression Models with Covariance Structures
    Fujikoshi, Yasunori
    Sakurai, Tetsuro
    [J]. MATHEMATICS, 2023, 11 (03)
  • [9] Estimation of the covariance matrix in multivariate partially linear models
    Przystalski, Marcin
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2014, 123 : 380 - 385
  • [10] Variable selection and estimation for high-dimensional spatial autoregressive models
    Cai, Liqian
    Maiti, Tapabrata
    [J]. SCANDINAVIAN JOURNAL OF STATISTICS, 2020, 47 (02) : 587 - 607