Entropy-based model-free feature screening for ultrahigh-dimensional multiclass classification

Cited by: 24
Authors
Ni, Lyu [1]
Fang, Fang [1]
Affiliations
[1] East China Normal Univ, Sch Stat, Shanghai 200241, Peoples R China
Keywords
entropy; feature screening; information gain; multiclass classification; sure screening property; varying coefficient models; Kolmogorov filter
DOI
10.1080/10485252.2016.1167206
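Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract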
Most feature screening methods for ultrahigh-dimensional classification explicitly or implicitly assume that the covariates are continuous. In practice, however, data sets commonly contain both categorical and continuous covariates, and very few feature screening methods are applicable to them. To handle this non-trivial situation, we propose an entropy-based feature screening method that is model free and provides a unified screening procedure for both categorical and continuous covariates. We establish the sure screening and ranking consistency properties of the proposed procedure. We investigate its finite-sample performance through simulation studies and illustrate the method with a real data analysis.
Pages: 515-530
Number of pages: 16
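
The abstract describes the screening idea only at a high level. The following is a minimal sketch of an information-gain ranking in that spirit, not the authors' procedure. It assumes that continuous covariates are discretized into equal-frequency slices before the entropy is computed and that covariates are ranked by the raw information gain H(Y) - H(Y | X). The function names (`entropy`, `information_gain`, `slice_by_quantiles`, `screen`), the slice count, and the simulated data are all hypothetical and chosen for illustration.

```python
import numpy as np


def entropy(labels):
    """Shannon entropy (natural log) of a discrete label vector."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())


def information_gain(x, y):
    """H(Y) - H(Y | X) for a discrete covariate x and class labels y."""
    h_cond = 0.0
    for level in np.unique(x):
        mask = x == level
        # Weight each conditional entropy by the empirical P(X = level).
        h_cond += mask.mean() * entropy(y[mask])
    return entropy(y) - h_cond


def slice_by_quantiles(x, n_slices=4):
    """Discretize a continuous covariate into roughly equal-frequency slices.

    Assumption for this sketch: quantile slicing stands in for whatever
    treatment of continuous covariates the paper actually uses.
    """
    cuts = np.quantile(x, np.linspace(0, 1, n_slices + 1)[1:-1])
    return np.searchsorted(cuts, x, side="right")


def screen(X, y, categorical, n_keep, n_slices=4):
    """Return indices of the n_keep covariates with the largest information gain."""
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        xj = X[:, j] if categorical[j] else slice_by_quantiles(X[:, j], n_slices)
        scores[j] = information_gain(xj, y)
    # Sort in decreasing order of score and keep the top n_keep indices.
    return np.argsort(-scores)[:n_keep].tolist()


# Simulated three-class example with one continuous and one categorical signal.
rng = np.random.default_rng(0)
n, p = 600, 50
y = rng.integers(0, 3, size=n)
X = rng.normal(size=(n, p))
X[:, 0] += 2 * y                              # continuous covariate shifted by class
X[:, 1] = (y + (rng.random(n) < 0.1)) % 3     # categorical covariate, 10% label noise
categorical = [False] * p
categorical[1] = True

top = screen(X, y, categorical, n_keep=5)
print(top)
```

Because the score depends only on empirical frequencies, the same function handles both covariate types once the continuous ones are sliced, which is the sense in which a procedure of this kind is "unified" and model free. Raw information gain tends to favor covariates with many levels, so a practical version would need some adjustment for the number of categories.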