A Novel Framework for Fast Feature Selection Based on Multi-Stage Correlation Measures

Cited by: 1
Authors
Garcia-Ramirez, Ivan-Alejandro [1]
Calderon-Mora, Arturo [1]
Mendez-Vazquez, Andres [1]
Ortega-Cisneros, Susana [2]
Reyes-Amezcua, Ivan [1]
Affiliations
[1] Inst Politecn Nacl, Dept Comp Sci, Ctr Invest & Estudios Avanzados, Zapopan 45017, Jalisco, Mexico
[2] Inst Politecn Nacl, Dept Elect Syst Design, Ctr Invest & Estudios Avanzados, Zapopan 45017, Jalisco, Mexico
Source
Keywords
machine learning; feature selection; python framework; machine learning applications; dimensionality reduction; mutual information; relevance
DOI
10.3390/make4010007
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Datasets with thousands of features are a challenge for many existing learning methods because of the well-known curse of dimensionality. Moreover, irrelevant and redundant features in a dataset can degrade the performance of any model trained on it or used for inference. In large datasets, managing features manually also tends to be impractical. This explains the growing interest in the machine learning literature in frameworks for the automatic discovery and removal of useless features. In this paper, we propose a novel framework for selecting relevant features in supervised datasets, based on a cascade of methods designed with speed and precision in mind. The framework consists of a novel combination of Approximated and Simulated Annealing versions of the Maximal Information Coefficient (MIC) to generalize the simple linear relation between features. The process runs as a series of steps that apply the MIC algorithms and cutoff strategies to remove irrelevant and redundant features. The framework is also designed to balance accuracy and speed. To test its performance, a series of experiments is conducted on a large battery of datasets, from SPECTF Heart to Sonar data. The results show the balance of accuracy and speed that the proposed framework can achieve.
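The abstract describes a cascade that first removes irrelevant features and then redundant ones, using a dependence measure and cutoffs. The following is a minimal sketch of that general idea only, not the authors' framework. It assumes a normalized histogram estimate of mutual information as a stand-in for the approximated and simulated-annealing MIC estimators, and the names `binned_dependence`, `select_features`, `relevance_cutoff`, and `redundancy_cutoff`, together with their default values, are hypothetical.

```python
import math
import random


def binned_dependence(x, y, bins=8):
    """Normalized histogram mutual information, roughly in [0, 1].

    Assumption: a simple stand-in for MIC, not the estimators
    named in the abstract.
    """
    def to_bins(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # constant column falls into one bin
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = to_bins(x), to_bins(y)
    n = len(x)
    joint, px, py = {}, {}, {}
    for i, j in zip(bx, by):
        joint[(i, j)] = joint.get((i, j), 0) + 1
        px[i] = px.get(i, 0) + 1
        py[j] = py.get(j, 0) + 1
    # Sum of p(i, j) * log(p(i, j) / (p(i) * p(j))) over occupied cells.
    mi = sum(c / n * math.log(c * n / (px[i] * py[j]))
             for (i, j), c in joint.items())
    return mi / math.log(bins)


def select_features(columns, target, relevance_cutoff=0.2, redundancy_cutoff=0.7):
    """Return indices of the kept columns, most relevant first."""
    # Stage 1: drop features whose dependence on the target is below the cutoff.
    relevance = {k: binned_dependence(col, target) for k, col in enumerate(columns)}
    candidates = sorted((k for k, r in relevance.items() if r >= relevance_cutoff),
                        key=lambda k: relevance[k], reverse=True)
    # Stage 2: drop features strongly dependent on one that is already kept.
    kept = []
    for k in candidates:
        if all(binned_dependence(columns[k], columns[j]) < redundancy_cutoff
               for j in kept):
            kept.append(k)
    return kept


random.seed(0)
x0 = [random.uniform(-1, 1) for _ in range(500)]
x1 = [2 * v + 1 for v in x0]                      # rescaled copy of x0: redundant
x2 = [random.uniform(-1, 1) for _ in range(500)]  # unrelated noise: irrelevant
y = [v * v for v in x0]                           # nonlinear function of x0
selected = select_features([x0, x1, x2], y)
print(selected)
```

In this toy data the target depends on `x0` only through a square, which a linear correlation would largely miss; the noise column should fall below the relevance cutoff, and only one of the two copies of `x0` should survive the redundancy stage. The cutoffs are illustrative and would need tuning on real data.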
Pages: 131-149
Page count: 19