Feature Screening via Distance Correlation Learning

Cited by: 582
Authors
Li, Runze [2, 3]
Zhong, Wei [1]
Zhu, Liping [4]
Affiliations
[1] Xiamen Univ, Dept Stat, Fujian Key Lab Stat Sci, Wang Yanan Inst Studies Econ, Xiamen 361005, Peoples R China
[2] Penn State Univ, Dept Stat, University Pk, PA 16802 USA
[3] Penn State Univ, Methodol Ctr, University Pk, PA 16802 USA
[4] Shanghai Univ Finance & Econ, Sch Stat & Management, Shanghai 200433, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Sure independence screening; Sure screening property; Ultrahigh dimensionality; Variable selection; Nonconcave penalized likelihood; Regression; Pathways; Cancers; Models
DOI
10.1080/01621459.2012.695654
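Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code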
020208 ; 070103 ; 0714 ;
Abstract
This article is concerned with screening features in ultrahigh-dimensional data analysis, which has become increasingly important in diverse scientific fields. We develop a sure independence screening procedure based on the distance correlation (DC-SIS). The DC-SIS can be implemented as easily as the sure independence screening (SIS) procedure based on the Pearson correlation proposed by Fan and Lv. However, the DC-SIS can significantly improve the SIS. Fan and Lv established the sure screening property for the SIS based on linear models, but the sure screening property is valid for the DC-SIS under more general settings, including linear models. Furthermore, the implementation of the DC-SIS does not require model specification (e.g., linear model or generalized linear model) for responses or predictors. This is a very appealing property in ultrahigh-dimensional data analysis. Moreover, the DC-SIS can be used directly to screen grouped predictor variables and multivariate response variables. We establish the sure screening property for the DC-SIS, and conduct simulations to examine its finite sample performance. A numerical comparison indicates that the DC-SIS performs much better than the SIS in various models. We also illustrate the DC-SIS through a real-data example.
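The screening idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it computes the standard sample distance correlation between each predictor and the response, then keeps the highest-ranked predictors. The function names (`distance_correlation`, `dc_sis`) and the cutoff `d` are hypothetical choices made for this example; the abstract does not specify how many predictors to retain. Ranking by the distance correlation or by its square gives the same order.

```python
import numpy as np


def _centered_distances(v):
    """Pairwise Euclidean distance matrix of v, double-centered."""
    v = np.asarray(v, dtype=float)
    if v.ndim == 1:
        v = v[:, None]  # treat a vector as n observations of one variable
    diff = v[:, None, :] - v[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    row_mean = dist.mean(axis=1, keepdims=True)
    col_mean = dist.mean(axis=0, keepdims=True)
    return dist - row_mean - col_mean + dist.mean()


def distance_correlation(x, y):
    """Sample distance correlation between x and y (a value in [0, 1])."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2_xy = (a * b).mean()                 # squared distance covariance
    dvar2 = (a * a).mean() * (b * b).mean()   # product of squared distance variances
    if dvar2 <= 0:
        return 0.0  # a constant variable has no dependence to measure
    return float(np.sqrt(max(dcov2_xy, 0.0) / np.sqrt(dvar2)))


def dc_sis(X, y, d):
    """Rank the columns of X by distance correlation with y; return the top d.

    d is a user-chosen cutoff (an assumption of this sketch).
    """
    X = np.asarray(X, dtype=float)
    scores = np.array(
        [distance_correlation(X[:, j], y) for j in range(X.shape[1])]
    )
    top = np.argsort(scores)[::-1][:d]
    return top, scores


if __name__ == "__main__":
    # Synthetic data: the response depends on the first predictor only,
    # and does so nonlinearly, so no linear model is assumed.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 50))
    y = X[:, 0] ** 2 + 0.1 * rng.standard_normal(200)
    selected, scores = dc_sis(X, y, d=5)
    print("selected columns:", selected)
```

Because each score depends only on pairwise distances, a group of predictors or a multivariate response can be passed as a two-dimensional array to `distance_correlation` without changing the code, which mirrors the abstract's remark about grouped predictors and multivariate responses.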
Pages: 1129-1139
Page count: 11