Model-free feature screening for ultrahigh-dimensional data conditional on some variables

Cited by: 10
Authors
Liu, Yi [1, 2]
Wang, Qihua [1, 3]
Affiliations
[1] Chinese Acad Sci, Acad Math & Syst Sci, Beijing 100190, Peoples R China
[2] China Univ Petr, Coll Sci, Qingdao 266580, Peoples R China
[3] Shenzhen Univ, Inst Stat Sci, Shenzhen 518006, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Conditional distance correlation; Feature selection; Sure screening property; High-dimensional data; Varying coefficient models; Distance correlation
DOI
10.1007/s10463-016-0597-2
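Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract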
In this paper, the conditional distance correlation (CDC) is used as a measure of dependence to develop a conditional feature screening procedure for ultrahigh-dimensional data, given some significant variables. The proposed procedure, called conditional distance correlation sure independence screening (CDC-SIS), is model free: it does not specify any model structure between the response and the predictors, which is appealing in some practical problems of ultrahigh-dimensional data analysis. The sure screening property of CDC-SIS is proved, and a simulation study is conducted to evaluate its finite-sample performance. A real data analysis illustrates the proposed method. The results indicate that CDC-SIS performs well.
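To make the idea of correlation-based screening concrete, here is a minimal sketch of the simpler, unconditional variant: ranking predictors by their sample distance correlation with the response and keeping the top few. It is not the CDC-SIS procedure of the paper, which additionally conditions on a set of given variables and whose estimator is not described in this record. The function names (`distance_correlation`, `screen`) and the `keep` parameter are hypothetical, chosen only for illustration.

```python
import numpy as np


def _centered_dist(v):
    """Pairwise Euclidean distance matrix of the sample, double-centered."""
    v = np.asarray(v, dtype=float)
    if v.ndim == 1:
        v = v[:, None]
    d = np.sqrt(((v[:, None, :] - v[None, :, :]) ** 2).sum(axis=-1))
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation (V-statistic form); returns 0.0 if either sample is constant."""
    a, b = _centered_dist(x), _centered_dist(y)
    dcov2 = (a * b).mean()
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    return 0.0 if denom == 0 else float(np.sqrt(max(dcov2, 0.0) / denom))


def screen(X, y, keep):
    """Rank columns of X by distance correlation with y; return top `keep` indices and all scores."""
    scores = np.array([distance_correlation(X[:, j], y) for j in range(X.shape[1])])
    order = np.argsort(-scores)  # descending by score
    return order[:keep], scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 30))
    # Assumed toy model: the response depends nonlinearly on column 3 only.
    y = X[:, 3] ** 2 + 0.1 * rng.standard_normal(200)
    top, scores = screen(X, y, keep=5)
    print("retained columns:", top)
```

Because distance correlation is zero in the population only under independence, a ranking of this kind can pick up nonlinear dependence without specifying a regression model, which is the sense of "model free" in the abstract. A conditional version would replace the marginal score with a measure of dependence given the conditioning variables.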
Pages: 283-301
Page count: 19
Related papers
50 in total
  • [1] Model-free feature screening for ultrahigh-dimensional data conditional on some variables
    Yi Liu
    Qihua Wang
    [J]. Annals of the Institute of Statistical Mathematics, 2018, 70 : 283 - 301
  • [2] Model-Free Feature Screening for Ultrahigh-Dimensional Data
    Zhu, Li-Ping
    Li, Lexin
    Li, Runze
    Zhu, Li-Xing
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2011, 106 (496) : 1464 - 1475
  • [3] A Robust Model-Free Feature Screening Method for Ultrahigh-Dimensional Data
    Xue, Jingnan
    Liang, Faming
    [J]. JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2017, 26 (04) : 803 - 813
  • [4] Model-free conditional screening for ultrahigh-dimensional survival data via conditional distance correlation
    Cui, Hengjian
    Liu, Yanyan
    Mao, Guangcai
    Zhang, Jing
    [J]. BIOMETRICAL JOURNAL, 2023, 65 (03)
  • [5] Model-free conditional independence feature screening for ultrahigh dimensional data
    Wang LuHeng
    Liu JingYuan
    Li Yong
    Li RunZe
    [J]. SCIENCE CHINA-MATHEMATICS, 2017, 60 (03) : 551 - 568
  • [6] Model-free conditional independence feature screening for ultrahigh dimensional data
    LuHeng Wang
    JingYuan Liu
    Yong Li
    RunZe Li
    [J]. Science China Mathematics, 2017, 60 : 551 - 568
  • [7] Model-free conditional independence feature screening for ultrahigh dimensional data
    WANG Lu Heng
    LIU Jing Yuan
    LI Yong
    LI Run Ze
    [J]. Science China Mathematics, 2017, 60 (03) : 551 - 568
  • [8] Model-free slice screening for ultrahigh-dimensional survival data
    Zhang, Jing
    Liu, Yanyan
    [J]. JOURNAL OF APPLIED STATISTICS, 2021, 48 (10) : 1755 - 1774
  • [9] Entropy-based model-free feature screening for ultrahigh-dimensional multiclass classification
    Ni, Lyu
    Fang, Fang
    [J]. JOURNAL OF NONPARAMETRIC STATISTICS, 2016, 28 (03) : 515 - 530
  • [10] A NEW MODEL-FREE FEATURE SCREENING PROCEDURE FOR ULTRAHIGH-DIMENSIONAL INTERVAL-CENSORED FAILURE TIME DATA
    Zhang, Jing
    Du, Mingyue
    Liu, Yanyan
    Sun, Jianguo
    [J]. STATISTICA SINICA, 2023, 33 (03) : 1809 - 1830