A Group Feature Ranking and Selection Method Based on Dimension Reduction Technique in High-Dimensional Data

Cited by: 5
Authors
Zubair, Iqbal Muhammad [1]
Kim, Byunghoon [1]
Affiliation
[1] Hanyang Univ, Dept Ind & Management Engn, Ansan, South Korea
Source
IEEE Access | 2022 / Vol. 10
Funding
National Research Foundation of Singapore;
Keywords
Dimension reduction; feature extraction; group feature ranking; group feature selection; high dimensional data; CANCER; ROBUST; REGRESSION; ENSEMBLE; GENES;
DOI
10.1109/ACCESS.2022.3225685
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Group feature selection methods select important feature groups and remove irrelevant ones to reduce model complexity. To the best of our knowledge, few group feature selection methods provide the relative importance of each feature group. To address this, we developed a sparse group feature ranking method for high-dimensional data based on a dimension reduction technique. First, we apply Relief to each group to remove irrelevant individual features. Second, we extract a new feature that represents each feature group by applying Fisher linear discriminant analysis (FDA) to the group, which reduces its multiple dimensions to a single dimension. Third, we estimate the relative importance of the extracted features with a random forest and select those whose importance scores are larger than the others. Finally, machine-learning algorithms can be used to train and test models on the selected features. In the experiments, we compared the proposed method with the supervised group lasso (SGL) method on real-life high-dimensional datasets. The results show that the proposed method selects a few important group features, as the existing group feature selection method does, and additionally provides the ranking and relative importance of all group features. In terms of classification performance metrics, SGL performs slightly better with logistic regression, whereas the proposed method performs better with support vector machine, random forest, and gradient boosting.
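The three steps in the abstract (per-group Relief filtering, projection of each group to one dimension with FDA, random-forest ranking of the projected features) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `relief_scores` and `rank_groups`, the `keep` fraction, the L1 distance inside Relief, the binary-class restriction, and the synthetic data are all assumptions made for illustration, and scikit-learn's `LinearDiscriminantAnalysis` stands in for FDA.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier


def relief_scores(X, y):
    """Basic binary Relief (assumed variant): L1 distance on min-max scaled features."""
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0  # avoid division by zero for constant features
    Xs = (X - X.min(axis=0)) / span
    n = len(Xs)
    scores = np.zeros(X.shape[1])
    for i in range(n):
        dist = np.abs(Xs - Xs[i]).sum(axis=1)
        dist[i] = np.inf  # never pick the sample itself as its neighbour
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))    # nearest same-class sample
        miss = np.argmin(np.where(~same, dist, np.inf))  # nearest other-class sample
        # A relevant feature differs on the miss and agrees on the hit.
        scores += np.abs(Xs[i] - Xs[miss]) - np.abs(Xs[i] - Xs[hit])
    return scores / n


def rank_groups(X, y, groups, keep=0.5, seed=0):
    """Rank feature groups; `groups` maps a group name to its column indices.

    `keep` (hypothetical parameter) is the fraction of features retained
    per group after the Relief step.
    """
    names, columns = [], []
    for name, idx in groups.items():
        idx = np.asarray(idx)
        # Step 1: drop the lowest-scoring individual features inside the group.
        s = relief_scores(X[:, idx], y)
        k = max(1, int(np.ceil(keep * len(idx))))
        kept = idx[np.argsort(s)[::-1][:k]]
        # Step 2: project the remaining features of the group onto one axis.
        lda = LinearDiscriminantAnalysis(n_components=1)
        z = lda.fit_transform(X[:, kept], y)[:, 0]
        names.append(name)
        columns.append(z)
    Z = np.column_stack(columns)  # one extracted feature per group
    # Step 3: random-forest importances give the relative importance of each group.
    rf = RandomForestClassifier(n_estimators=200, random_state=seed).fit(Z, y)
    imp = rf.feature_importances_
    order = np.argsort(imp)[::-1]
    return [(names[j], float(imp[j])) for j in order], Z


# Synthetic demo data (assumption): only the first group carries class signal.
rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 12))
X[:, 0:2] += 2.0 * y[:, None]
groups = {
    "g_signal": [0, 1, 2, 3],
    "g_noise_a": [4, 5, 6, 7],
    "g_noise_b": [8, 9, 10, 11],
}
ranking, Z = rank_groups(X, y, groups)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

Selecting groups then amounts to keeping the top entries of `ranking`. In this sketch the projection and the forest are fitted on the same samples, so the importances are optimistic; a held-out split or cross-validation would be needed before drawing conclusions from them.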
Pages: 125136-125147
Page count: 12