A general framework for boosting feature subset selection algorithms

Cited by: 14
Authors
Perez-Rodriguez, Javier [1 ]
de Haro-Garcia, Aida [1 ]
Romero del Castillo, Juan A. [1 ]
Garcia-Pedrajas, Nicolas [1 ]
Affiliations
[1] Univ Cordoba, Campus Rabanales, Cordoba 14011, Spain
Keywords
Feature selection; Boosting; Classifier ensembles; CLASSIFICATION; ENSEMBLES;
DOI
10.1016/j.inffus.2018.03.003
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Feature selection is one of the most important tasks in many machine learning and data mining problems. Due to the increasing size of the problems, removing useless, erroneous or noisy features is frequently an initial step that is performed before other data mining algorithms are applied. The aim is to reproduce, or even improve, the performance of the data mining algorithm when all the features are used. Furthermore, the selection of the most relevant features may offer the expert valuable information about the problem to be solved. Over the past few decades, many different feature selection algorithms have been proposed, each with its own strengths and weaknesses. However, as in the case of classification, it is unlikely that a single feature selection algorithm would be able to achieve good results across many different datasets and application fields. Furthermore, when we are dealing with thousands of features, the most powerful feature selection methods are frequently too time consuming to be applied. In classification, one of the most successful ways of consistently improving the performance of a single weak learner is to construct ensembles using boosting methods. In this paper, we propose a general framework for feature selection boosting in the same way boosting is applied to classification. The proposed approach opens a new field of research in which to apply the many techniques developed for boosting classifiers. Using 120 datasets, the experiments reported show a clear improvement in several state-of-the-art feature selection algorithms using the proposed methodology.
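The abstract does not spell out the algorithm, so the following is only a minimal sketch of what "boosting a feature selector in the same way boosting is applied to classification" could look like, not the authors' method. It assumes an AdaBoost-style loop: instance weights are updated after each round, a base selector is rerun on the reweighted data, and the selected features receive votes. The base selector (weighted class-mean gap), the weak learner (nearest centroid), the voting rule, and all function names are hypothetical choices made for illustration.

```python
import math
import random

def weighted_class_means(X, y, w, j):
    """Weighted mean of feature j within each class (labels are 0/1)."""
    sums, totals = [0.0, 0.0], [0.0, 0.0]
    for row, label, wi in zip(X, y, w):
        sums[label] += wi * row[j]
        totals[label] += wi
    return [s / t if t > 0 else 0.0 for s, t in zip(sums, totals)]

def select_features(X, y, w, k):
    """Hypothetical base selector: keep the k features with the largest
    weighted gap between the two class means."""
    gaps = []
    for j in range(len(X[0])):
        m0, m1 = weighted_class_means(X, y, w, j)
        gaps.append((abs(m1 - m0), j))
    gaps.sort(reverse=True)
    return [j for _, j in gaps[:k]]

def centroid_predict(X, y, w, features):
    """Hypothetical weak learner: weighted nearest-centroid classifier
    restricted to the selected features."""
    means = {j: weighted_class_means(X, y, w, j) for j in features}
    preds = []
    for row in X:
        d = [sum((row[j] - means[j][c]) ** 2 for j in features) for c in (0, 1)]
        preds.append(0 if d[0] <= d[1] else 1)
    return preds

def boosted_feature_selection(X, y, k=2, rounds=5):
    """Assumed AdaBoost-style wrapper around the base selector."""
    n = len(X)
    w = [1.0 / n] * n                 # start from uniform instance weights
    votes = [0.0] * len(X[0])         # accumulated vote per feature
    for _ in range(rounds):
        features = select_features(X, y, w, k)
        preds = centroid_predict(X, y, w, features)
        err = sum(wi for wi, p, t in zip(w, preds, y) if p != t)
        err = max(err, 1e-10)         # avoid division by zero below
        if err >= 0.5:                # no better than chance: stop early
            break
        alpha = 0.5 * math.log((1 - err) / err)
        for j in features:
            votes[j] += alpha         # vote weighted by round accuracy
        # Upweight misclassified instances, downweight the rest, renormalise.
        w = [wi * math.exp(alpha if p != t else -alpha)
             for wi, p, t in zip(w, preds, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    ranked = sorted(range(len(votes)), key=lambda j: votes[j], reverse=True)
    return sorted(ranked[:k])

def make_toy_data(gap=2.0, n=200, seed=0):
    """Synthetic data: features 0 and 1 are informative, 2 to 5 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        informative = [label * gap + rng.gauss(0, 0.5) for _ in range(2)]
        noise = [rng.gauss(0, 1) for _ in range(4)]
        X.append(informative + noise)
        y.append(label)
    return X, y

if __name__ == "__main__":
    X, y = make_toy_data()
    print(boosted_feature_selection(X, y, k=2, rounds=5))
```

The design mirrors classifier boosting only loosely: instances that the current feature subset handles poorly gain weight, so later rounds favour features that help on those instances. The paper itself should be consulted for the actual framework and its variants.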
Pages: 147-175
Number of pages: 29
Related papers
50 in total
  • [31] featsel: A framework for benchmarking of feature selection algorithms and cost functions
    Reis, Marcelo S.
    Estrela, Gustavo
    Ferreira, Carlos Eduardo
    Barrera, Junior
    [J]. SOFTWAREX, 2017, 6 : 193 - 197
  • [32] Investigating the Effect of Fixing the Subset Length using Ant Colony Optimization Algorithms for Feature Subset Selection Problems
    Abd-Alsabour, Nadia
    Randall, Marcus
    Lewis, Andrew
    [J]. 2012 13TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING, APPLICATIONS, AND TECHNOLOGIES (PDCAT 2012), 2012, : 733 - 738
  • [33] A selection framework of sensor combination feature subset for human motion phase segmentation
    Wang, Jiaxin
    Wang, Zhelong
    Qiu, Sen
    Xu, Jian
    Zhao, Hongyu
    Fortino, Giancarlo
    Habib, Masood
    [J]. INFORMATION FUSION, 2021, 70 : 1 - 11
  • [34] A Novel Subset Feature Selection Framework for Increasing the Classification Performance of SONAR Targets
    Potharaju, Sai Prasad
    Sreedevi, M.
    [J]. 6TH INTERNATIONAL CONFERENCE ON SMART COMPUTING AND COMMUNICATIONS, 2018, 125 : 902 - 909
  • [35] Boosting instance selection algorithms
    Garcia-Pedrajas, Nicolas
    de Haro-Garcia, Aida
    [J]. KNOWLEDGE-BASED SYSTEMS, 2014, 67 : 342 - 360
  • [36] A boosting algorithm with subset selection of training patterns
    Nakashima, T
    Nakai, G
    Ishibuchi, H
    [J]. PROCEEDINGS OF THE 12TH IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1 AND 2, 2003, : 690 - 695
  • [37] Application of feature subset selection based on evolutionary algorithms for automatic emotion recognition in speech
    Alvarez, Aitor
    Cearreta, Idoia
    Lopez, Juan Miguel
    Arruti, Andoni
    Lazkano, Elena
    Sierra, Basilio
    Garay, Nestor
    [J]. ADVANCES IN NONLINEAR SPEECH PROCESSING, 2007, 4885 : 273 - 281
  • [38] Feature subset selection for face detection using genetic algorithms and particle swarm optimization
    Shoorehdeli, Mahdi Aliyari
    Teshnehlab, Mohammad
    Moghaddam, H. Abrishami
    [J]. PROCEEDINGS OF THE 2006 IEEE INTERNATIONAL CONFERENCE ON NETWORKING, SENSING AND CONTROL, 2006, : 686 - 690
  • [39] Feature selection using ModifiedBoostARoota and prediction of heart diseases using Gradient Boosting algorithms
    Anuradha, P.
    David, Vasantha Kalyani
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION, AND INTELLIGENT SYSTEMS (ICCCIS), 2021, : 19 - 23
  • [40] Online Feature Selection with Capricious Streaming Features: A General Framework
    Wu, Di
    He, Yi
    Luo, Xin
    Shang, Mingsheng
    Wu, Xindong
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2019, : 683 - 688