RaSE: A Variable Screening Framework via Random Subspace Ensembles

Cited by: 5
Authors
Tian, Ye [1]
Feng, Yang [2]
Affiliations
[1] Columbia Univ, Dept Stat, New York, NY USA
[2] NYU, Sch Global Publ Hlth, Dept Biostat, New York, NY 10027 USA
Keywords
Ensemble learning; High-dimensional data; Random subspace method; Rank consistency; Sure screening property; Variable screening; Variable selection; KOLMOGOROV FILTER; GENE-EXPRESSION; SELECTION; REGRESSION; MODELS;
DOI
10.1080/01621459.2021.1938084
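Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification code
020208 ; 070103 ; 0714 ;
Abstract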
Variable screening methods have been shown to be effective for dimension reduction in the ultra-high-dimensional setting. Most existing screening methods rank the predictors according to their individual contributions to the response. As a result, variables that are marginally independent of the response but jointly dependent with it can be missed. In this work, we propose a new framework for variable screening, random subspace ensemble (RaSE), which works by evaluating the quality of random subspaces that may cover multiple predictors. The framework can be combined naturally with any subspace evaluation criterion, which leads to an array of screening methods. It is capable of identifying signals with no marginal effect or with high-order interaction effects, and it is shown to enjoy the sure screening property and rank consistency. We also develop an iterative version of RaSE screening with theoretical support. Extensive simulation studies and real-data analysis show the effectiveness of the new screening framework.
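The abstract describes the idea only at a high level, so the following is a minimal illustrative sketch of screening by random subspaces, not the authors' algorithm or software. Everything specific in it is an assumption made for illustration: the function names `bic_linear` and `random_subspace_screen`, the parameters `n_rounds`, `n_candidates`, and `max_size`, the choice of a linear-model BIC as the subspace evaluation criterion, and the rule of keeping the best of several candidate subspaces per round and ranking predictors by how often they appear in the winners.

```python
import numpy as np


def bic_linear(X, y):
    """Hypothetical criterion: BIC of a least-squares fit with intercept (lower is better)."""
    n = X.shape[0]
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    # Small constant guards against log(0) on a perfect fit.
    return n * np.log(rss / n + 1e-12) + design.shape[1] * np.log(n)


def random_subspace_screen(X, y, n_rounds=100, n_candidates=20, max_size=3,
                           criterion=bic_linear, seed=0):
    """Return, for each predictor, the fraction of rounds in which it
    belonged to the best-scoring random subspace (assumed ranking rule)."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    counts = np.zeros(p)
    for _ in range(n_rounds):
        best_score, best_subset = np.inf, None
        for _ in range(n_candidates):
            # Draw a random subspace: random size, then random predictors.
            size = rng.integers(1, max_size + 1)
            subset = rng.choice(p, size=size, replace=False)
            score = criterion(X[:, subset], y)
            if score < best_score:
                best_score, best_subset = score, subset
        counts[best_subset] += 1
    return counts / n_rounds


# Toy data (also an assumption): only predictors 0 and 1 carry signal.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 20))
y = 3 * X[:, 0] + 3 * X[:, 1] + rng.standard_normal(200)

freq = random_subspace_screen(X, y)
top_two = set(np.argsort(-freq)[:2].tolist())
print(sorted(top_two))
```

Because each subspace is scored as a whole, any criterion with the same call signature can be passed as `criterion`, which mirrors the abstract's remark that the framework can be combined with any subspace evaluation criterion.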
Pages: 457-468
Number of pages: 12
Related papers
50 in total
  • [31] A linear discriminant analysis framework based on random subspace for face recognition
    Zhang, Xiaoxun
    Jia, Yunde
    PATTERN RECOGNITION, 2007, 40 (09) : 2585 - 2591
  • [32] Combining the wisdom of crowds and technical analysis for financial market prediction using deep random subspace ensembles
    Wang, Qili
    Xu, Wei
    Zheng, Han
    NEUROCOMPUTING, 2018, 299 : 51 - 61
  • [33] Using random subspace method for prediction and variable importance assessment in linear regression
    Mielniczuk, Jan
    Teisseyre, Pawel
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2014, 71 : 725 - 742
  • [34] On uniqueness of distribution of a random variable whose independent copies span a subspace in Lp
    Astashkin, S.
    Sukochev, F.
    Zanin, D.
    STUDIA MATHEMATICA, 2015, 230 (01) : 41 - 57
  • [35] On the application of SVM-Ensembles based on adapted random subspace sampling for automatic classification of NMR data
    Lienemann, Kai
    Ploetz, Thomas
    Fink, Gernot A.
    MULTIPLE CLASSIFIER SYSTEMS, PROCEEDINGS, 2007, 4472 : 42 - +
  • [36] RANDOM VARIABLE GENERATION VIA DOUBLE SAMPLING
    KNIGHT, JL
    SATCHELL, SE
    ECONOMETRIC THEORY, 1990, 6 (04) : 487 - 488
  • [37] RANDOM VARIABLE GENERATION VIA DOUBLE SAMPLING
    KNIGHT, JL
    SATCHELL, SE
    ECONOMETRIC THEORY, 1992, 8 (01) : 152 - 155
  • [38] Quality Control of Variable Duration Batch Processes via Subspace Identification
    Corbett, Brandon
    Mhaskar, Prashant
    2017 AMERICAN CONTROL CONFERENCE (ACC), 2017, : 1505 - 1510
  • [39] Attack Agnostic Detection of Adversarial Examples via Random Subspace Analysis
    Drenkow, Nathan
    Fendley, Neil
    Burlina, Philippe
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 2815 - 2825
  • [40] Load Shedding against Short-Term Voltage Instability Using Random Subspace Based SVM Ensembles
    Zhu, Lipeng
    Lu, Chao
    Han, Yingduo
    2017 IEEE POWER & ENERGY SOCIETY GENERAL MEETING, 2017,