A robust variable screening procedure for ultra-high dimensional data

Cited by: 5
Authors
Ghosh, Abhik [1 ]
Thoresen, Magne [2 ]
Affiliations
[1] Indian Stat Inst, Interdisciplinary Stat Res Unit, Kolkata, India
[2] Univ Oslo, Dept Biostat, Oslo Ctr Biostat & Epidemiol, Oslo, Norway
Keywords
Variable selection; NP dimensionality; independence screening; minimum density power divergence estimator; influence function; gene selection; DENSITY POWER DIVERGENCE; SELECTION; MODELS; LASSO;
DOI
10.1177/09622802211017299
Chinese Library Classification (CLC) number
R19 [Health care organization and services (health services administration)];
Abstract
Variable selection in ultra-high dimensional regression problems has become an important issue. In such situations, penalized regression models may face computational problems and some pre-screening of the variables may be necessary. A number of procedures for such pre-screening have been developed; among them, Sure Independence Screening (SIS) enjoys some popularity. However, SIS is vulnerable to outliers in the data, and in small samples in particular this may lead to faulty inference. In this paper, we develop a new robust screening procedure. We build on the density power divergence (DPD) estimation approach and introduce DPD-SIS and its extension, iterative DPD-SIS. We illustrate the behavior of the methods through extensive simulation studies and show that they are superior to both the original SIS and other robust methods when there are outliers in the data. Finally, we illustrate their use in a study on the regulation of lipid metabolism.
Pages: 1816 - 1832
Number of pages: 17
Related papers
50 in total
  • [31] Conditional screening for ultra-high dimensional covariates with survival outcomes
    Hyokyoung G. Hong
    Jian Kang
    Yi Li
    [J]. Lifetime Data Analysis, 2018, 24 : 45 - 71
  • [32] Uniform joint screening for ultra-high dimensional graphical models
    Zheng, Zemin
    Shi, Haiyu
    Li, Yang
    Yuan, Hui
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2020, 179
  • [33] Adjusted feature screening for ultra-high dimensional missing response
    Zou, Liying
    Liu, Yi
    Zhang, Zhonghu
    [J]. JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2024, 94 (03) : 460 - 483
  • [34] Conditional screening for ultra-high dimensional covariates with survival outcomes
    Hong, Hyokyoung G.
    Kang, Jian
    Li, Yi
    [J]. LIFETIME DATA ANALYSIS, 2018, 24 (01) : 45 - 71
  • [35] Model-free conditional feature screening for ultra-high dimensional right censored data
    Chen, Xiaolin
    [J]. JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2018, 88 (12) : 2425 - 2446
  • [36] Nonparametric independence screening for ultra-high dimensional generalized varying coefficient models with longitudinal data
    Zhang, Shen
    Zhao, Peixin
    Li, Gaorong
    Xu, Wangli
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2019, 171 : 37 - 52
  • [37] The fused Kolmogorov–Smirnov screening for ultra-high dimensional semi-competing risks data
    Liu, Yi
    Chen, Xiaolin
    Wang, Hong
    [J]. Applied Mathematical Modelling, 2021, 98 : 109 - 120
  • [38] BOLT-SSI: A STATISTICAL APPROACH TO SCREENING INTERACTION EFFECTS FOR ULTRA-HIGH DIMENSIONAL DATA
    Zhou, Min
    Dai, Mingwei
    Yao, Yuan
    Liu, Jin
    Yang, Can
    Peng, Heng
    [J]. STATISTICA SINICA, 2023, 33 (04) : 2327 - 2358
  • [39] Forward variable selection for ultra-high dimensional quantile regression models
    Honda, Toshio
    Lin, Chien-Tong
    [J]. ANNALS OF THE INSTITUTE OF STATISTICAL MATHEMATICS, 2023, 75 (03) : 393 - 424
  • [40] Forward variable selection for ultra-high dimensional quantile regression models
    Toshio Honda
    Chien-Tong Lin
    [J]. Annals of the Institute of Statistical Mathematics, 2023, 75 : 393 - 424