Model-free feature screening for high-dimensional survival data

Cited by: 9
Authors
Lin, Yuanyuan [1]
Liu, Xianhui [2, 3]
Hao, Meiling [4]
Affiliations
[1] Chinese Univ Hong Kong, Dept Stat, Hong Kong 999077, Hong Kong, Peoples R China
[2] Jiangxi Univ Finance & Econ, Sch Stat, Nanchang 330013, Jiangxi, Peoples R China
[3] Jiangxi Univ Finance & Econ, Res Ctr Appl Stat, Nanchang 330013, Jiangxi, Peoples R China
[4] Univ Hlth Network, Princess Margaret Canc Ctr, Toronto, ON M5G 2M9, Canada
Funding
National Natural Science Foundation of China; Canadian Institutes of Health Research
Keywords
feature screening; random censoring; robustness; sure independence screening; ultra-high dimension; proportional hazards model; variable selection; heterogeneous data; NP-dimensionality; oracle properties; adaptive lasso; Cox model; inequalities; regression
DOI
10.1007/s11425-016-9116-6
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
With the rapid growth in the size of scientific data across disciplines, feature screening plays an important role in reducing high dimensionality to a moderate scale. In this paper, we introduce a unified and robust model-free feature screening approach for high-dimensional survival data with censoring. The approach has several advantages. First, it is model-free under a general model framework, and hence avoids the complication of specifying an actual model form with a huge number of candidate variables. Second, under mild conditions that do not require the existence of any moment of the response, it enjoys the ranking consistency and sure screening properties in ultra-high dimension. In particular, we assume that the response and the censoring variable are conditionally independent given each covariate, rather than assuming that the censoring variable is independent of the response and the covariates. Moreover, we propose a more robust variant of the new procedure, which possesses desirable theoretical properties without any finite moment condition on the predictors or the response. The proposed methods require no complicated numerical optimization and are fast and easy to implement. Extensive numerical studies demonstrate that the proposed methods perform competitively under various configurations. The application is illustrated with an analysis of a genetic data set.
Pages: 1617-1636
Number of pages: 20
Related papers
50 in total
  • [41] Surrogate-variable-based model-free feature screening for survival data under the general censoring mechanism
    Jing Zhang
    Qihua Wang
    Xuan Wang
    [J]. Annals of the Institute of Statistical Mathematics, 2022, 74 : 379 - 397
  • [42] Model-free data screening and cleaning
    Tarter, Michael E.
    [J]. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS, 2011, 3 (02): : 168 - 176
  • [43] The cumulative Kolmogorov filter for model-free screening in ultrahigh dimensional data
    Kim, Arlene Kyoung Hee
    Shin, Seung Jun
    [J]. STATISTICS & PROBABILITY LETTERS, 2017, 126 : 238 - 243
  • [44] Model-free feature screening for ultrahigh dimensional data via a Pearson chi-square based index
    Ma, Weidong
    Xiao, Jingsong
    Yang, Ying
    Ye, Fei
    [J]. JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2022, 92 (15) : 3222 - 3248
  • [45] Feature selection for high-dimensional data
    Destrero A.
    Mosci S.
    De Mol C.
    Verri A.
    Odone F.
    [J]. Computational Management Science, 2009, 6 (1) : 25 - 40
  • [46] Feature selection for high-dimensional data
    Bolón-Canedo V.
    Sánchez-Maroño N.
    Alonso-Betanzos A.
    [J]. Progress in Artificial Intelligence, 2016, 5 (2) : 65 - 75
  • [47] Model-Free Conditional Feature Screening with FDR Control
    Tong, Zhaoxue
    Cai, Zhanrui
    Yang, Songshan
    Li, Runze
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2023, 118 (544) : 2575 - 2587
  • [48] Model-free conditional feature screening with exposure variables
    Zhou, Yeqing
    Liu, Jingyuan
    Hao, Zhihui
    Zhu, Liping
    [J]. STATISTICS AND ITS INTERFACE, 2019, 12 (02) : 239 - 251
  • [49] A NEW MODEL-FREE FEATURE SCREENING PROCEDURE FOR ULTRAHIGH-DIMENSIONAL INTERVAL-CENSORED FAILURE TIME DATA
    Zhang, Jing
    Du, Mingyue
    Liu, Yanyan
    Sun, Jianguo
    [J]. STATISTICA SINICA, 2023, 33 (03) : 1809 - 1830
  • [50] High-dimensional feature screening for nonlinear associations with survival outcome using restricted mean survival time
    Chen, Yaxian
    Lam, Kwok Fai
    Liu, Zhonghua
    [J]. STAT, 2024, 13 (02):