Debiased lasso for generalized linear models with a diverging number of covariates

Cited by: 6
Authors
Xia, Lu [1]
Nan, Bin [2]
Li, Yi [3]
Affiliations
[1] Univ Washington, Dept Biostat, Seattle, WA 98195 USA
[2] Univ Calif Irvine, Dept Stat, Irvine, CA 92717 USA
[3] Univ Michigan, Dept Biostat, Ann Arbor, MI 48109 USA
Funding
National Institutes of Health (USA)
Keywords
asymptotics; bias correction; high-dimensional regression; lung cancer; statistical inference; NONCONCAVE PENALIZED LIKELIHOOD; GENOME-WIDE ASSOCIATION; P-REGRESSION PARAMETERS; VARIABLE SELECTION; LUNG-CANCER; CONFIDENCE-INTERVALS; ASYMPTOTIC-BEHAVIOR; M-ESTIMATORS; SUSCEPTIBILITY; REGULARIZATION;
DOI
10.1111/biom.13587
Chinese Library Classification number
Q [Biological sciences]
Discipline classification codes
07; 0710; 09
Abstract
Modeling and drawing inference on the joint associations between single-nucleotide polymorphisms and a disease has sparked interest in genome-wide association studies. In the motivating Boston Lung Cancer Survival Cohort (BLCSC) data, the presence of a large number of single-nucleotide polymorphisms of interest, though smaller than the sample size, challenges inference on their joint associations with the disease outcome. In similar settings, we find that neither the debiased lasso approach (van de Geer et al., 2014), which assumes sparsity on the inverse information matrix, nor the standard maximum likelihood method can yield confidence intervals with satisfactory coverage probabilities for generalized linear models. Under this "large n, diverging p" scenario, we propose an alternative debiased lasso approach by directly inverting the Hessian matrix without imposing the matrix sparsity assumption, which further reduces bias compared to the original debiased lasso and ensures valid confidence intervals with nominal coverage probabilities. We establish the asymptotic distributions of any linear combinations of the parameter estimates, which lays the theoretical ground for drawing inference. Simulations show that the proposed refined debiased estimating method performs well in removing bias and yields honest confidence interval coverage. We use the proposed method to analyze the aforementioned BLCSC data, a large-scale hospital-based epidemiology cohort study investigating the joint effects of genetic variants on lung cancer risk.
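The idea in the abstract, correcting a lasso estimate with the directly inverted Hessian, can be illustrated with a minimal sketch for logistic regression when p < n. This is not the authors' implementation. The function names, the proximal-gradient solver, the omission of an intercept, the simulated data, the penalty level `lam`, and the model-based standard error sqrt(Theta_jj / n) are all assumptions made here for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lasso_logistic(X, y, lam, n_iter=2000):
    """L1-penalized logistic regression via proximal gradient descent (hypothetical helper)."""
    n, p = X.shape
    # Step size 1/L, where L = ||X||_2^2 / (4n) bounds the largest Hessian eigenvalue.
    step = 4.0 * n / np.linalg.norm(X, 2) ** 2
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (sigmoid(X @ beta) - y) / n      # gradient of the average negative log-likelihood
        z = beta - step * grad
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft-thresholding
    return beta

def debias_by_hessian_inverse(X, y, beta_lasso):
    """One-step bias correction using the inverse Hessian evaluated at the lasso estimate."""
    n, p = X.shape
    mu = sigmoid(X @ beta_lasso)
    score = X.T @ (y - mu) / n                        # gradient of the average log-likelihood
    hessian = (X * (mu * (1.0 - mu))[:, None]).T @ X / n
    theta = np.linalg.inv(hessian)                    # direct inversion, feasible because p < n
    beta_db = beta_lasso + theta @ score
    se = np.sqrt(np.diag(theta) / n)                  # model-based standard errors (assumption)
    return beta_db, se

# Simulated example (assumed setup): n = 2000 observations, p = 10 covariates.
rng = np.random.default_rng(0)
n, p = 2000, 10
beta_true = np.zeros(p)
beta_true[:3] = [1.0, -1.0, 0.5]
X = rng.standard_normal((n, p))
y = (rng.random(n) < sigmoid(X @ beta_true)).astype(float)

beta_lasso = lasso_logistic(X, y, lam=0.05)
beta_db, se = debias_by_hessian_inverse(X, y, beta_lasso)

# Wald-type 95% confidence intervals for each coefficient.
ci_lower = beta_db - 1.96 * se
ci_upper = beta_db + 1.96 * se
```

In this sketch the lasso estimate is shrunk toward zero by the penalty, and the one-step update moves it back using the score and the inverted Hessian. No sparsity is imposed on the inverse matrix, which matches the abstract's description only at this high level.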
Pages: 344-357
Number of pages: 14
Related papers
50 in total
  • [41] Variable selection in partially linear additive hazards model with grouped covariates and a diverging number of parameters
    Arfan Raheen Afzal
    Jing Yang
    Xuewen Lu
    [J]. Computational Statistics, 2021, 36 : 829 - 855
  • [42] Generalized Linear Models With Coarsened Covariates: A Practical Bayesian Approach
    Johnson, Timothy R.
    Wiest, Michelle M.
    [J]. PSYCHOLOGICAL METHODS, 2014, 19 (02) : 281 - 299
  • [43] Non-ignorable missing covariates in generalized linear models
    Lipsitz, SR
    Ibrahim, JG
    Chen, MH
    Peterson, H
    [J]. STATISTICS IN MEDICINE, 1999, 18 (17-18) : 2435 - 2448
  • [44] Latent covariates in generalized linear models: A Rasch model approach
    Christensen, Karl Bang
    [J]. ADVANCES IN STATISTICAL METHODS FOR THE HEALTH SCIENCES: APPLICATIONS TO CANCER AND AIDS STUDIES, GENOME SEQUENCE ANALYSIS, AND SURVIVAL ANALYSIS, 2007 : 95 - 108
  • [45] Bayesian analysis for generalized linear models with nonignorably missing covariates
    Huang, L
    Chen, MH
    Ibrahim, JG
    [J]. BIOMETRICS, 2005, 61 (03) : 767 - 780
  • [46] Generalized linear mixed models with informative dropouts and missing covariates
    Wu, Kunling
    Wu, Lang
    [J]. METRIKA, 2007, 66 (01) : 1 - 18
  • [47] Model selection of generalized partially linear models with missing covariates
    Fu, Ying-Zi
    Chen, Xue-Dong
    [J]. JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2012, 142 (01) : 126 - 138
  • [48] Robust methods for generalized linear models with nonignorable missing covariates
    Sinha, Sanjoy K.
    [J]. CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE, 2008, 36 (02) : 277 - 299
  • [49] Bayesian methods for generalized linear models with covariates missing at random
    Ibrahim, JG
    Chen, MH
    Lipsitz, SR
    [J]. CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE, 2002, 30 (01) : 55 - 78
  • [50] Generalized linear mixed models with informative dropouts and missing covariates
    Kunling Wu
    Lang Wu
    [J]. Metrika, 2007, 66 : 1 - 18