Unified model-free interaction screening via CV-entropy filter

Cited by: 0
Authors
Xiong, Wei [1]
Chen, Yaxian [2]
Ma, Shuangge [3]
Affiliations
[1] Univ Int Business & Econ, Sch Stat, Beijing 100872, Peoples R China
[2] Univ Hong Kong, Dept Stat & Actuarial Sci, Hong Kong, Peoples R China
[3] Yale Sch Publ Hlth, Dept Biostat, New Haven, CT USA
Keywords
Coefficient of variation; Conditional entropy; Interaction analysis; Marginal screening; GENE-GENE; VARIABLE SELECTION; ASSOCIATION; REGRESSION; EPISTASIS; TESTS
DOI
10.1016/j.csda.2022.107684
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
For many practical high-dimensional problems, interactions have been increasingly found to play important roles beyond main effects. A representative example is gene-gene interaction. Joint analysis, which analyzes all interactions and main effects in a single model, can be seriously challenged by high dimensionality. For high-dimensional data analysis in general, marginal screening has been established as effective for reducing computational cost, increasing stability, and improving estimation/selection performance. Most of the existing marginal screening methods are designed for the analysis of main effects only. The existing screening methods for interaction analysis are often limited by making stringent model assumptions, lacking robustness, and/or requiring predictors to be continuous (and hence lacking flexibility). A unified marginal screening approach tailored to interaction analysis is developed, which can be applied to regression, classification, and survival analysis. Predictors are allowed to be continuous and discrete. The proposed approach is built on Coefficient of Variation (CV) filters based on information entropy. Statistical properties are rigorously established. It is shown that the CV filters are almost insensitive to the distribution tails of predictors, correlation structure among predictors, and sparsity level of signals. An efficient two-stage algorithm is developed to make the proposed approach scalable to ultrahigh-dimensional data. Simulations and the analysis of TCGA LUAD data further establish the practical superiority of the proposed approach. © 2022 Elsevier B.V. All rights reserved.
Pages: 14
Related papers
50 in total
  • [1] Model-free screening for variables with treatment interaction
    Bizuayehu, Shiferaw B.
    Xu, Jin
    [J]. STATISTICAL METHODS IN MEDICAL RESEARCH, 2022, 31 (10) : 1845 - 1859
  • [2] THE FUSED KOLMOGOROV FILTER: A NONPARAMETRIC MODEL-FREE SCREENING METHOD
    Mai, Qing
    Zou, Hui
    [J]. ANNALS OF STATISTICS, 2015, 43 (04) : 1471 - 1497
  • [3] The concordance filter: an adaptive model-free feature screening procedure
    Cheng, Xuewei
    Li, Gang
    Wang, Hong
    [J]. COMPUTATIONAL STATISTICS, 2024, 39 (05) : 2413 - 2436
  • [4] Model-free sure screening via maximum correlation
    Huang, Qiming
    Zhu, Yu
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2016, 148 : 89 - 106
  • [5] Model-Free Forward Screening Via Cumulative Divergence
    Zhou, Tingyou
    Zhu, Liping
    Xu, Chen
    Li, Runze
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2020, 115 (531) : 1393 - 1405
  • [6] The cumulative Kolmogorov filter for model-free screening in ultrahigh dimensional data
    Kim, Arlene Kyoung Hee
    Shin, Seung Jun
    [J]. STATISTICS & PROBABILITY LETTERS, 2017, 126 : 238 - 243
  • [8] Robust model-free feature screening via quantile correlation
    Ma, Xuejun
    Zhang, Jingxiao
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2016, 143 : 472 - 480
  • [9] Model-free conditional screening via conditional distance correlation
    Lu, Jun
    Lin, Lu
    [J]. STATISTICAL PAPERS, 2020, 61 (01) : 225 - 244
  • [10] An efficient model-free approach to interaction screening for high dimensional data
    Xiong, Wei
    Pan, Han
    Wang, Jianrong
    Tian, Maozai
    [J]. STATISTICS IN MEDICINE, 2023, 42 (10) : 1583 - 1605