A Group Feature Ranking and Selection Method Based on Dimension Reduction Technique in High-Dimensional Data

Cited by: 5
Authors
Zubair, Iqbal Muhammad [1]
Kim, Byunghoon [1]
Affiliations
[1] Hanyang Univ, Dept Ind & Management Engn, Ansan, South Korea
Source
IEEE ACCESS | 2022 / Vol. 10
Funding
National Research Foundation of Singapore;
Keywords
Dimension reduction; feature extraction; group feature ranking; group feature selection; high dimensional data; CANCER; ROBUST; REGRESSION; ENSEMBLE; GENES;
DOI
10.1109/ACCESS.2022.3225685
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Group feature selection methods select important feature groups and remove irrelevant ones to reduce model complexity. To the best of our knowledge, few group feature selection methods provide the relative importance of each feature group. To address this, we developed a sparse group feature ranking method based on a dimension reduction technique for high-dimensional data. First, we apply Relief to each group to remove irrelevant individual features. Second, we extract a new feature that represents each feature group by applying Fisher linear discriminant analysis (FDA) to the group, which reduces its multiple dimensions to a single dimension. Finally, we estimate the relative importance of the extracted features with a random forest and select the features whose importance scores are larger than the others. Machine-learning algorithms can then be trained and tested on the selected features. In the experiments, we compared the proposed method with the supervised group lasso (SGL) method on real-life high-dimensional datasets. The results show that the proposed method selects a few important group features, as the existing group feature selection method does, and additionally provides the ranking and relative importance of all group features. In terms of classification performance metrics, SGL performs slightly better with logistic regression, whereas the proposed method performs better with support vector machine, random forest, and gradient boosting.
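The three steps named in the abstract (Relief within each group, FDA to one dimension per group, random forest importance across groups) can be sketched roughly as follows. This is an illustrative reading of the abstract, not the authors' implementation: the basic binary Relief variant, the `keep_ratio` cutoff, the `groups` dictionary layout, the function names, and the synthetic data are all assumptions made for the example.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier


def relief_scores(X, y):
    """Basic binary Relief (assumed variant): a feature scores higher when it
    differs at the nearest miss and agrees at the nearest hit."""
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0                      # avoid division by zero
    Z = (X - X.min(axis=0)) / span             # range-scale every feature
    scores = np.zeros(X.shape[1])
    for i in range(len(Z)):
        diff = np.abs(Z - Z[i])                # per-feature distance to sample i
        dist = diff.sum(axis=1)                # Manhattan distance
        dist[i] = np.inf                       # exclude the sample itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))
        miss = np.argmin(np.where(~same, dist, np.inf))
        scores += diff[miss] - diff[hit]
    return scores / len(Z)


def rank_groups(X, y, groups, keep_ratio=0.5, seed=0):
    """groups maps a group name to its column indices (hypothetical layout).
    keep_ratio is an assumed cutoff; the abstract does not state one."""
    names, extracted = [], []
    for name, cols in groups.items():
        cols = np.asarray(cols)
        # Step 1: keep the top-scoring individual features inside the group.
        s = relief_scores(X[:, cols], y)
        n_keep = max(1, int(np.ceil(keep_ratio * len(cols))))
        kept = cols[np.argsort(s)[::-1][:n_keep]]
        # Step 2: project the kept features onto one Fisher discriminant axis.
        lda = LinearDiscriminantAnalysis(n_components=1)
        extracted.append(lda.fit_transform(X[:, kept], y)[:, 0])
        names.append(name)
    # Step 3: rank the one-dimensional group features by forest importance.
    F = np.column_stack(extracted)
    rf = RandomForestClassifier(n_estimators=200, random_state=seed).fit(F, y)
    order = np.argsort(rf.feature_importances_)[::-1]
    return [(names[i], float(rf.feature_importances_[i])) for i in order]


# Synthetic two-class data: only group "A" carries the class signal.
rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 12))
X[:, 0:4] += 2.0 * y[:, None]
groups = {"A": [0, 1, 2, 3], "B": [4, 5, 6, 7], "C": [8, 9, 10, 11]}

ranking = rank_groups(X, y, groups)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

In this sketch every group receives an importance score, so a downstream classifier could be trained on the top-ranked extracted features only. For a fair evaluation, the Relief, FDA, and random forest steps would presumably be fitted on the training split alone.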
Pages: 125136-125147
Number of pages: 12