Sparse model identification and learning for ultra-high-dimensional additive partially linear models

Cited by: 3
Authors
Li, Xinyi [1]
Wang, Li [2]
Nettleton, Dan [2]
Affiliations
[1] Univ N Carolina, Dept Stat & Operat Res, SAMSI, Chapel Hill, NC 27709 USA
[2] Iowa State Univ, Dept Stat, Ames, IA 50011 USA
Keywords
Dimension reduction; Inference for ultra-high-dimensional data; Semiparametric regression; Spline-backfitted local polynomial; Structure identification; Variable selection; Diverging number
DOI
10.1016/j.jmva.2019.02.010
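Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714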
Abstract
The additive partially linear model (APLM) combines the flexibility of nonparametric regression with the parsimony of regression models, and is widely used in multivariate nonparametric regression to alleviate the "curse of dimensionality". A natural question in practice is the choice of structure in the nonparametric part, i.e., whether each continuous covariate enters the model in linear or nonparametric form. In this paper, we present a comprehensive framework for simultaneous sparse model identification and learning for ultra-high-dimensional APLMs, where the numbers of linear and nonparametric components may both exceed the sample size. We propose a fast and efficient two-stage procedure. In the first stage, we decompose each nonparametric function into a linear part and a nonlinear part. The nonlinear functions are approximated by constant spline bases, and a triple penalization procedure is proposed to select nonzero components using the adaptive group LASSO. In the second stage, we refit the data with the selected covariates using higher-order polynomial splines, and apply spline-backfitted local-linear smoothing to obtain asymptotic normality for the estimators. The procedure is shown to be consistent for model structure identification: it correctly and efficiently identifies zero, linear, and nonlinear components. Inference can be made on both the linear coefficients and the nonparametric functions. We conduct simulation studies to evaluate the performance of the method and, for illustration, apply it to a dataset on the Shoot Apical Meristem (SAM) of maize genotypes. © 2019 Elsevier Inc. All rights reserved.
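The first stage described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`nonlinear_basis`, `build_design`, `group_lasso`, `demo`), the equal-width knots, the proximal-gradient solver, the inverse-norm adaptive weights, and the simulated data are all assumptions made for illustration. Each continuous covariate contributes a linear column plus a group of centered piecewise-constant spline columns made orthogonal to that linear column, and a group LASSO penalty then decides whether each piece is kept.

```python
import numpy as np

def nonlinear_basis(x, n_intervals=5):
    """Centered piecewise-constant spline basis, orthogonal to the linear term in x."""
    edges = np.linspace(x.min(), x.max(), n_intervals + 1)
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_intervals - 1)
    B = np.zeros((x.size, n_intervals))
    B[np.arange(x.size), idx] = 1.0
    B = B[:, 1:]                        # indicators sum to one, so drop one column
    B = B - B.mean(axis=0)              # center each column
    xc = x - x.mean()
    return B - np.outer(xc, xc @ B) / (xc @ xc)   # remove the linear trend

def build_design(Z, X, n_intervals=5):
    """Stack columns for Z, linear parts of X, and nonlinear parts of X."""
    blocks, labels = [], []
    for j in range(Z.shape[1]):
        blocks.append(Z[:, [j]] - Z[:, j].mean())
        labels.append(("z", j))
    for k in range(X.shape[1]):
        xc = X[:, k] - X[:, k].mean()
        blocks.append(xc[:, None])
        labels.append(("x_linear", k))
        blocks.append(nonlinear_basis(X[:, k], n_intervals))
        labels.append(("x_nonlinear", k))
    sizes = [b.shape[1] for b in blocks]
    starts = np.cumsum([0] + sizes[:-1])
    groups = [np.arange(s, s + m) for s, m in zip(starts, sizes)]
    return np.hstack(blocks), groups, labels

def group_soft_threshold(v, t):
    """Shrink the whole vector v toward zero; return zeros if its norm is at most t."""
    norm = np.linalg.norm(v)
    if norm <= t:
        return np.zeros_like(v)
    return (1.0 - t / norm) * v

def group_lasso(D, y, groups, lam, weights=None, n_iter=500):
    """Proximal gradient for (1/2n)||y - D b||^2 + lam * sum_g w_g ||b_g||."""
    n, p = D.shape
    weights = np.ones(len(groups)) if weights is None else weights
    step = n / np.linalg.norm(D, 2) ** 2          # 1 / Lipschitz constant of the gradient
    b = np.zeros(p)
    for _ in range(n_iter):
        z = b - step * (D.T @ (D @ b - y)) / n    # gradient step on the squared loss
        for g, w in zip(groups, weights):
            b[g] = group_soft_threshold(z[g], step * lam * w)
    return b

def demo(lam=0.05, seed=0):
    """Simulated example (assumed data): returns labels of the selected groups."""
    rng = np.random.default_rng(seed)
    n = 200
    Z = rng.normal(size=(n, 3))
    X = rng.uniform(-1, 1, size=(n, 3))
    y = Z[:, 0] + 2.0 * X[:, 0] + np.sin(np.pi * X[:, 1]) + 0.1 * rng.normal(size=n)
    D, groups, labels = build_design(Z, X)
    yc = y - y.mean()
    b0 = group_lasso(D, yc, groups, lam)          # initial, unweighted fit
    norms = np.array([np.linalg.norm(b0[g]) for g in groups])
    w = 1.0 / np.maximum(norms, 1e-8)             # adaptive weights (assumed form)
    b = group_lasso(D, yc, groups, lam, weights=w)
    return [lab for lab, g in zip(labels, groups) if np.linalg.norm(b[g]) > 0]
```

Calling `demo()` returns the labels of the groups with nonzero coefficients. Under this sketch, a covariate in X whose `x_linear` and `x_nonlinear` groups are both dropped would be read as a zero component, one that keeps only `x_linear` as linear, and one that keeps `x_nonlinear` as nonlinear. The result depends on the tuning parameter `lam`, which is chosen arbitrarily here.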
Pages: 204-228
Page count: 25