A Group Feature Ranking and Selection Method Based on Dimension Reduction Technique in High-Dimensional Data

Cited by: 5
Authors
Zubair, Iqbal Muhammad [1]
Kim, Byunghoon [1]
Affiliations
[1] Hanyang Univ, Dept Ind & Management Engn, Ansan, South Korea
Source
IEEE ACCESS | 2022 / Vol. 10
Funding
National Research Foundation, Singapore;
Keywords
Dimension reduction; feature extraction; group feature ranking; group feature selection; high dimensional data; CANCER; ROBUST; REGRESSION; ENSEMBLE; GENES;
DOI
10.1109/ACCESS.2022.3225685
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Group feature selection methods select important feature groups and remove irrelevant ones to reduce model complexity. To the best of our knowledge, few group feature selection methods provide the relative importance of each feature group. To address this, we developed a sparse group feature ranking method based on a dimension reduction technique for high-dimensional data. First, we apply Relief to each group to remove irrelevant individual features. Second, we extract a new feature that represents each feature group by applying Fisher linear discriminant analysis (FDA) to each group, which reduces the group's multiple dimensions to a single dimension. Finally, we estimate the relative importance of the extracted features with a random forest and select the features whose importance scores are larger than those of the others. Machine-learning algorithms can then be used to train and test models on the selected features. In the experiments, we compared the proposed method with the supervised group lasso (SGL) method on real-life high-dimensional datasets. The results show that the proposed method selects a few important group features, just as the existing group feature selection method does, and additionally provides the ranking and relative importance of all group features. In terms of classification performance metrics, SGL performs slightly better with logistic regression, whereas the proposed method performs better with support vector machine, random forest, and gradient boosting.
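The three steps described in the abstract (per-group Relief filtering, per-group FDA projection to one dimension, random forest ranking) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the basic binary Relief variant, the `keep` fraction used as the Relief cutoff, the use of scikit-learn's `LinearDiscriminantAnalysis` as the FDA step, the helper names `relief_weights` and `rank_groups`, and the synthetic data are all assumptions that the abstract does not specify.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier


def relief_weights(X, y):
    """Basic binary Relief (assumed variant): a feature gains weight when it
    differs on the nearest miss and loses weight when it differs on the
    nearest hit. Uses L1 distance on features scaled to [0, 1]."""
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span
    w = np.zeros(X.shape[1])
    for i in range(len(Xs)):
        diff = np.abs(Xs - Xs[i])      # per-feature differences to sample i
        dist = diff.sum(axis=1)
        dist[i] = np.inf               # never pick the sample itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))
        miss = np.argmin(np.where(~same, dist, np.inf))
        w += diff[miss] - diff[hit]
    return w / len(Xs)


def rank_groups(X, y, groups, keep=0.5, seed=0):
    """groups maps a group name to its column indices (hypothetical layout).
    Returns group names and importances, sorted by decreasing importance."""
    names, extracted = [], []
    for name, cols in groups.items():
        cols = np.asarray(cols)
        # Step 1: keep the top `keep` fraction of features by Relief weight.
        w = relief_weights(X[:, cols], y)
        n_keep = max(1, int(np.ceil(keep * len(cols))))
        kept = cols[np.argsort(w)[::-1][:n_keep]]
        # Step 2: project the kept features of this group onto one dimension.
        lda = LinearDiscriminantAnalysis(n_components=1)
        extracted.append(lda.fit_transform(X[:, kept], y)[:, 0])
        names.append(name)
    # Step 3: rank the extracted group features by random forest importance.
    Z = np.column_stack(extracted)
    rf = RandomForestClassifier(n_estimators=200, random_state=seed).fit(Z, y)
    order = np.argsort(rf.feature_importances_)[::-1]
    return [names[i] for i in order], rf.feature_importances_[order]


# Synthetic two-class data: only group "A" carries class signal.
rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 12))
X[:, 0:4] += 1.5 * y[:, None]
groups = {"A": [0, 1, 2, 3], "B": [4, 5, 6, 7], "C": [8, 9, 10, 11]}

ranked, scores = rank_groups(X, y, groups)
for name, score in zip(ranked, scores):
    print(f"{name}: {score:.3f}")
```

Every group receives an importance score, so a final selection could keep the groups above some cutoff while still reporting the full ranking. The cutoff rule itself is not given in the abstract and is left out here.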
Pages: 125136-125147
Page count: 12