Variable selection in regression with compositional covariates

Cited by: 148
Authors
Lin, Wei [1]
Shi, Pixu [1]
Feng, Rui [1]
Li, Hongzhe [1]
Affiliations
[1] Univ Penn, Perelman Sch Med, Dept Biostat & Epidemiol, Philadelphia, PA 19104 USA
Funding
National Institutes of Health (USA)
Keywords
Compositional data; Coordinate descent method of multipliers; High-dimensional regression; Lasso; Log-contrast model; Model selection; Regularization; Sparsity; MICROBIOME DATA-ANALYSIS; LASSO; OBESITY; SHRINKAGE; ECOLOGY; MODEL;
DOI
10.1093/biomet/asu031
Chinese Library Classification
Q [Biological sciences]
Discipline classification codes
07; 0710; 09
Abstract
Motivated by research problems arising in the analysis of gut microbiome and metagenomic data, we consider variable selection and estimation in high-dimensional regression with compositional covariates. We propose an ℓ1 regularization method for the linear log-contrast model that respects the unique features of compositional data. We formulate the proposed procedure as a constrained convex optimization problem and introduce a coordinate descent method of multipliers for efficient computation. In the high-dimensional setting where the dimensionality grows at most exponentially with the sample size, model selection consistency and ℓ∞ bounds for the resulting estimator are established under conditions that are mild and interpretable for compositional data. The numerical performance of our method is evaluated via simulation studies and its usefulness is illustrated by an application to a microbiome study relating human body mass index to gut microbiome composition.
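The abstract names the ingredients (a linear log-contrast model, an ℓ1 penalty, a constrained convex problem, a coordinate descent method of multipliers) but gives no formulas. The following is a minimal sketch of how such a scheme could look, not the authors' implementation. It assumes the usual zero-sum constraint on the coefficients of a log-contrast model, a squared-error loss scaled by 1/(2n), and a scaled augmented Lagrangian with penalty parameter `mu`. The function names, the default values, and the stopping rules are all hypothetical.

```python
import numpy as np


def soft_threshold(t, lam):
    """Scalar soft-thresholding operator S_lam(t)."""
    return np.sign(t) * max(abs(t) - lam, 0.0)


def log_contrast_lasso(X, y, lam, mu=1.0, n_outer=200, n_inner=50, tol=1e-8):
    """Sketch: minimize (1/2n)||y - Z b||^2 + lam * ||b||_1 subject to sum(b) = 0,
    where Z = log(X) and each row of X is a composition (positive, sums to 1).

    Assumed scheme: coordinate descent on the augmented Lagrangian
    (inner loop), followed by an update of the scaled multiplier alpha (outer loop).
    """
    Z = np.log(X)
    n, p = Z.shape
    Zc = Z - Z.mean(axis=0)          # centering absorbs the intercept
    yc = y - y.mean()
    v = (Zc ** 2).sum(axis=0) / n    # per-column curvature ||z_j||^2 / n

    beta = np.zeros(p)
    alpha = 0.0                      # scaled Lagrange multiplier
    resid = yc.copy()                # always equals yc - Zc @ beta

    for _ in range(n_outer):
        for _ in range(n_inner):
            max_change = 0.0
            for j in range(p):
                old = beta[j]
                # (1/n) z_j^T (partial residual excluding coordinate j)
                rho = Zc[:, j] @ resid / n + v[j] * old
                # sum of the other coefficients plus the scaled multiplier
                s = beta.sum() - old + alpha
                new = soft_threshold(rho - mu * s, lam) / (v[j] + mu)
                if new != old:
                    resid -= Zc[:, j] * (new - old)
                    beta[j] = new
                    max_change = max(max_change, abs(new - old))
            if max_change < tol:
                break
        gap = beta.sum()             # violation of the zero-sum constraint
        alpha += gap                 # multiplier update
        if abs(gap) < tol:
            break
    return beta


if __name__ == "__main__":
    # Synthetic compositions: exponentiate Gaussian noise, then normalize rows.
    rng = np.random.default_rng(0)
    W = rng.normal(size=(100, 10))
    X = np.exp(W) / np.exp(W).sum(axis=1, keepdims=True)
    true_beta = np.array([1.0, -0.8, 0.6, -0.8, 0, 0, 0, 0, 0, 0])  # sums to zero
    y = np.log(X) @ true_beta + 0.1 * rng.normal(size=100)
    est = log_contrast_lasso(X, y, lam=0.05)
    print(np.round(est, 3), "sum =", est.sum())
```

Each coordinate update is a closed-form soft-thresholding step, so the constraint is handled only through the multiplier `alpha`, and the estimate satisfies the zero-sum condition approximately rather than exactly. The tuning parameter `lam` would in practice be chosen by cross-validation or an information criterion, which this sketch leaves out.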
Pages: 785-797
Number of pages: 13
Related papers
50 in total
  • [1] VARIABLE SELECTION IN NONPARAMETRIC REGRESSION WITH CONTINUOUS COVARIATES
    ZHANG, P
    [J]. ANNALS OF STATISTICS, 1991, 19 (04) : 1869 - 1882
  • [2] VARIABLE SELECTION IN NONPARAMETRIC REGRESSION WITH CATEGORICAL COVARIATES
    BICKEL, P
    PING, Z
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1992, 87 (417) : 90 - 97
  • [3] Variable selection for linear regression models with random covariates
    Nkiet, GM
    [J]. COMPTES RENDUS DE L ACADEMIE DES SCIENCES SERIE I-MATHEMATIQUE, 2001, 333 (12) : 1105 - 1110
  • [4] Quantile regression for compositional covariates
    Ma, Xuejun
    Zhang, Ping
    [J]. COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2023, 52 (03) : 658 - 668
  • [5] Robust regression with compositional covariates
    Mishra, Aditya
    Muller, Christian L.
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2022, 165
  • [6] Bayesian variable selection for the Cox regression model with missing covariates
    Joseph G. Ibrahim
    Ming-Hui Chen
    Sungduk Kim
    [J]. Lifetime Data Analysis, 2008, 14 : 496 - 520
  • [7] Imputation and variable selection in linear regression models with missing covariates
    Yang, XW
    Belin, TR
    Boscardin, WJ
    [J]. BIOMETRICS, 2005, 61 (02) : 498 - 506
  • [8] Variable selection in multivariate regression models with measurement error in covariates
    Cui, Jingyu
    Yi, Grace Y.
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2024, 202
  • [9] Variable Selection in the Cox Regression Model with Covariates Missing at Random
    Garcia, Ramon I.
    Ibrahim, Joseph G.
    Zhu, Hongtu
    [J]. BIOMETRICS, 2010, 66 (01) : 97 - 104
  • [10] Bayesian variable selection for the Cox regression model with missing covariates
    Ibrahim, Joseph G.
    Chen, Ming-Hui
    Kim, Sungduk
    [J]. LIFETIME DATA ANALYSIS, 2008, 14 (04) : 496 - 520