NONPARAMETRIC INDEPENDENCE SCREENING AND STRUCTURE IDENTIFICATION FOR ULTRA-HIGH DIMENSIONAL LONGITUDINAL DATA

Cited by: 54
Authors
Cheng, Ming-Yen [1]
Honda, Toshio [2]
Li, Jialiang [3]
Peng, Heng [4]
Affiliations
[1] Natl Taiwan Univ, Dept Math, Taipei 106, Taiwan
[2] Hitotsubashi Univ, Grad Sch Econ, Kunitachi, Tokyo 1868601, Japan
[3] Natl Univ Singapore, Dept Stat & Appl Probabil, Singapore 117546, Singapore
[4] Hong Kong Baptist Univ, Dept Math, Kowloon, Hong Kong, Peoples R China
Source
ANNALS OF STATISTICS | 2014 / Vol. 42 / Issue 05
Keywords
Independence screening; longitudinal data; B-spline; SCAD; sparsity; oracle property; VARYING-COEFFICIENT MODELS; NONCONCAVE PENALIZED LIKELIHOOD; PARTIALLY LINEAR-MODELS; VARIABLE SELECTION; SEMIPARAMETRIC ESTIMATION; NP-DIMENSIONALITY; ORACLE PROPERTIES; COX MODELS; REGRESSION; LASSO;
DOI
10.1214/14-AOS1236
Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification code
020208 ; 070103 ; 0714 ;
Abstract
Ultra-high dimensional longitudinal data are increasingly common and the analysis is challenging both theoretically and methodologically. We offer a new automatic procedure for finding a sparse semivarying coefficient model, which is widely accepted for longitudinal data analysis. Our proposed method first reduces the number of covariates to a moderate order by employing a screening procedure, and then identifies both the varying and constant coefficients using a group SCAD estimator, which is subsequently refined by accounting for the within-subject correlation. The screening procedure is based on working independence and B-spline marginal models. Under weaker conditions than those in the literature, we show that with high probability only irrelevant variables will be screened out, and the number of selected variables can be bounded by a moderate order. This allows the desirable sparsity and oracle properties of the subsequent structure identification step. Note that existing methods require some kind of iterative screening in order to achieve this, thus they demand heavy computational effort and consistency is not guaranteed. The refined semivarying coefficient model employs profile least squares, local linear smoothing and nonparametric covariance estimation, and is semiparametric efficient. We also suggest ways to implement the proposed methods, and to select the tuning parameters. An extensive simulation study is summarized to demonstrate its finite sample performance and the yeast cell cycle data is analyzed.
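The screening step described in the abstract (working independence plus B-spline marginal models) can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names (`bspline_basis`, `simulate`, `marginal_screen`), the simulated data, the knot placement, and the ranking statistic (reduction in residual sum of squares relative to an intercept-only spline fit) are all assumptions made for illustration, and the paper's exact marginal utility and threshold may differ. Observations from all subjects are pooled and within-subject correlation is ignored, which is what working independence means here.

```python
import numpy as np


def bspline_basis(t, n_interior=3, degree=3):
    """Clamped B-spline basis on [0, 1], evaluated at t by Cox-de Boor recursion."""
    t = np.asarray(t, dtype=float)
    interior = np.linspace(0.0, 1.0, n_interior + 2)[1:-1]
    knots = np.concatenate([np.zeros(degree + 1), interior, np.ones(degree + 1)])
    n0 = len(knots) - 1
    # Degree-0 basis functions are indicators of the knot intervals.
    B = np.zeros((t.size, n0))
    for i in range(n0):
        B[:, i] = (knots[i] <= t) & (t < knots[i + 1])
    # Assign the right endpoint t == 1 to the last non-empty interval.
    B[t == 1.0, n0 - degree - 1] = 1.0
    for d in range(1, degree + 1):
        B_new = np.zeros((t.size, n0 - d))
        for i in range(n0 - d):
            left = knots[i + d] - knots[i]
            right = knots[i + d + 1] - knots[i + 1]
            if left > 0:
                B_new[:, i] += (t - knots[i]) / left * B[:, i]
            if right > 0:
                B_new[:, i] += (knots[i + d + 1] - t) / right * B[:, i + 1]
        B = B_new
    return B


def simulate(n_subjects=100, m=5, p=200, seed=0):
    """Hypothetical longitudinal data: covariate 0 has a varying coefficient,
    covariate 1 a constant coefficient, all others are irrelevant."""
    rng = np.random.default_rng(seed)
    n_obs = n_subjects * m
    t = rng.uniform(0.0, 1.0, n_obs)
    X = rng.normal(size=(n_obs, p))
    # A shared random effect induces within-subject correlation.
    subject_effect = np.repeat(rng.normal(scale=0.5, size=n_subjects), m)
    noise = rng.normal(scale=0.5, size=n_obs)
    y = 2.0 * np.sin(2.0 * np.pi * t) * X[:, 0] + 1.5 * X[:, 1] + subject_effect + noise
    return y, X, t


def marginal_screen(y, X, t, n_keep, n_interior=3):
    """Rank covariates by the RSS reduction of the marginal spline model
    y ~ a(t) + b_k(t) * X_k over the intercept-only model y ~ a(t)."""
    B = bspline_basis(t, n_interior)
    coef0, *_ = np.linalg.lstsq(B, y, rcond=None)
    rss0 = np.sum((y - B @ coef0) ** 2)
    scores = np.empty(X.shape[1])
    for k in range(X.shape[1]):
        D = np.hstack([B, B * X[:, [k]]])  # spline-expanded a(t) and b_k(t) X_k
        coef, *_ = np.linalg.lstsq(D, y, rcond=None)
        scores[k] = rss0 - np.sum((y - D @ coef) ** 2)
    return np.argsort(scores)[::-1][:n_keep]


y, X, t = simulate()
selected = marginal_screen(y, X, t, n_keep=10)
print(sorted(selected.tolist()))
```

In this sketch the number of retained covariates (`n_keep`) is fixed by hand; a data-driven threshold and the subsequent group SCAD structure identification step are outside its scope.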
Pages: 1819-1849
Page count: 31
Related papers
50 in total
  • [1] Nonparametric independence screening for ultra-high dimensional generalized varying coefficient models with longitudinal data
    Zhang, Shen
    Zhao, Peixin
    Li, Gaorong
    Xu, Wangli
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2019, 171 : 37 - 52
  • [2] Nonparametric independence screening for ultra-high-dimensional longitudinal data under additive models
    Niu, Yong
    Zhang, Riquan
    Liu, Jicai
    Li, Huapeng
    [J]. JOURNAL OF NONPARAMETRIC STATISTICS, 2018, 30 (04) : 884 - 905
  • [3] Conditional distance correlation sure independence screening for ultra-high dimensional survival data
    Lu, Shuiyun
    Chen, Xiaolin
    Wang, Hong
    [J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2021, 50 (08) : 1936 - 1953
  • [4] Sparsity identification in ultra-high dimensional quantile regression models with longitudinal data
    Gao, Xianli
    Liu, Qiang
    [J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2020, 49 (19) : 4712 - 4736
  • [5] Nonparametric Independence Screening in Sparse Ultra-High-Dimensional Additive Models
    Fan, Jianqing
    Feng, Yang
    Song, Rui
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2011, 106 (494) : 544 - 557
  • [6] Nonparametric Independence Screening in Sparse Ultra-High-Dimensional Varying Coefficient Models
    Fan, Jianqing
    Ma, Yunbei
    Dai, Wei
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2014, 109 (507) : 1270 - 1284
  • [7] A robust variable screening procedure for ultra-high dimensional data
    Ghosh, Abhik
    Thoresen, Magne
    [J]. STATISTICAL METHODS IN MEDICAL RESEARCH, 2021, 30 (08) : 1816 - 1832
  • [8] A sure independence screening procedure for ultra-high dimensional partially linear additive models
    Kazemi, M.
    Shahsavani, D.
    Arashi, M.
    [J]. JOURNAL OF APPLIED STATISTICS, 2019, 46 (08) : 1385 - 1403
  • [9] On correlation rank screening for ultra-high dimensional competing risks data
    Chen, Xiaolin
    Li, Chenguang
    Zhang, Tao
    Gao, Zhenlong
    [J]. JOURNAL OF APPLIED STATISTICS, 2022, 49 (07) : 1848 - 1864
  • [10] Grouped variable screening for ultra-high dimensional data for linear model
    Qiu, Debin
    Ahn, Jeongyoun
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2020, 144