Overfitting in wrapper-based feature subset selection: The harder you try the worse it gets

Cited by: 57
Authors
Loughrey, J [1]
Cunningham, P [1]
Affiliations
[1] Univ Dublin Trinity Coll, Dublin 2, Ireland
DOI
10.1007/1-84628-102-4_3
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In wrapper-based feature selection, the more states that are visited during the search phase of the algorithm, the greater the likelihood of finding a feature subset that has a high internal accuracy while generalizing poorly. When this occurs, we say that the algorithm has overfitted to the training data. We outline a set of experiments to show this, and we introduce a modified genetic algorithm to address this overfitting problem by stopping the search before overfitting occurs. This new algorithm, called GAWES (Genetic Algorithm With Early Stopping), reduces the level of overfitting and yields feature subsets that have a better generalization accuracy.
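The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' GAWES implementation: the abstract does not say how the stopping point is chosen, so the patience rule on a held-out split used here is an assumption, as are the synthetic data, the 1-nearest-neighbour wrapped classifier, and every function and parameter name (`make_data`, `knn1_accuracy`, `ga_with_early_stopping`, `patience`).

```python
import random

def make_data(n, n_informative, n_noise, rng):
    """Synthetic binary data: informative features shift with the label, the rest are noise."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(1.5 * label, 1.0) for _ in range(n_informative)]
        row += [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(row)
        y.append(label)
    return X, y

def knn1_accuracy(train_X, train_y, test_X, test_y, mask):
    """Accuracy of a 1-nearest-neighbour classifier restricted to the masked features."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for x, label in zip(test_X, test_y):
        nearest = min(range(len(train_X)),
                      key=lambda j: sum((train_X[j][i] - x[i]) ** 2 for i in idx))
        correct += train_y[nearest] == label
    return correct / len(test_y)

def ga_with_early_stopping(internal, external, n_features, rng,
                           pop_size=12, max_gens=30, patience=3):
    """GA over feature masks, guided by `internal` accuracy.

    Assumed stopping rule (not from the paper): halt once the held-out
    `external` accuracy of the best mask has not improved for `patience`
    consecutive generations, and return the best mask seen on held-out data.
    """
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    best_mask, best_ext, stale = None, -1.0, 0
    history = []  # (internal accuracy, held-out accuracy) of each generation's champion
    for _ in range(max_gens):
        ranked = sorted(pop, key=internal, reverse=True)
        champion = ranked[0]
        ext = external(champion)
        history.append((internal(champion), ext))
        if ext > best_ext:
            best_mask, best_ext, stale = champion[:], ext, 0
        else:
            stale += 1
            if stale >= patience:
                break  # held-out accuracy has stalled: stop searching
        parents = ranked[:pop_size // 2]  # elitism: the top half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            flip = rng.randrange(n_features)    # single-bit mutation
            child[flip] = 1 - child[flip]
            children.append(child)
        pop = parents + children
    return best_mask, best_ext, history

rng = random.Random(0)
X, y = make_data(120, 3, 9, rng)
fit_X, fit_y = X[:40], y[:40]      # data the wrapped classifier is built on
int_X, int_y = X[40:80], y[40:80]  # scores that steer the search (internal accuracy)
val_X, val_y = X[80:], y[80:]      # held-out data used only for the stopping rule

def internal(mask):
    return knn1_accuracy(fit_X, fit_y, int_X, int_y, mask)

def external(mask):
    return knn1_accuracy(fit_X, fit_y, val_X, val_y, mask)

mask, val_acc, history = ga_with_early_stopping(internal, external, 12, rng)
print("selected mask:", mask)
print("held-out accuracy:", val_acc, "generations run:", len(history))
```

Because the top half of each population survives unchanged, the champion's internal accuracy in `history` can only stay level or rise as the search continues, while its held-out accuracy is free to stall or fall. That gap is the overfitting effect the abstract describes, and the patience rule is one simple way to cut the search short.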
Pages: 33-43
Page count: 11
Related papers
50 in total
  • [1] Stability of Filter- and Wrapper-Based Feature Subset Selection
    Wald, Randall
    Khoshgoftaar, Taghi M.
    Napolitano, Amri
    [J]. 2013 IEEE 25TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2013, : 374 - 380
  • [2] Wrapper-Based Feature Subset Selection for Rapid Image Information Mining
    Durbha, Surya S.
    King, Roger L.
    Younan, Nicolas H.
    [J]. IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2010, 7 (01) : 43 - 47
  • [3] Improving Incremental Wrapper-Based Feature Subset Selection by Using Re-ranking
    Bermejo, Pablo
    Gamez, Jose A.
    Puerta, Jose M.
    [J]. TRENDS IN APPLIED INTELLIGENT SYSTEMS, PT I, PROCEEDINGS, 2010, 6096 : 580 - 589
  • [4] Wrapper-Based Feature Selection to Classify Flatfoot Disease
    Miguel-Andres, Israel
    Ramos-Frutos, Jorge
    Sharawi, Marwa
    Oliva, Diego
    Reyes-Davila, Elivier
    Casas-Ordaz, Angel
    Perez-Cisneros, Marco
    Zapotecas-Martinez, Saul
    [J]. IEEE ACCESS, 2024, 12 : 22433 - 22447
  • [5] Wrapper-Based Federated Feature Selection for IoT Environments
    Mahanipour, Afsaneh
    Khamfroush, Hana
    [J]. 2023 INTERNATIONAL CONFERENCE ON COMPUTING, NETWORKING AND COMMUNICATIONS, ICNC, 2023, : 214 - 219
  • [6] A novel wrapper-based feature subset selection method using modified binary differential evolution algorithm
    Tarkhaneh, Omid
    Thanh Thi Nguyen
    Mazaheri, Samaneh
    [J]. INFORMATION SCIENCES, 2021, 565 : 278 - 305
  • [7] Wrapper-based feature selection: how important is the wrapped classifier?
    Bajer, Drazen
    Dudjak, Mario
    Zoric, Bruno
    [J]. PROCEEDINGS OF 2020 INTERNATIONAL CONFERENCE ON SMART SYSTEMS AND TECHNOLOGIES (SST 2020), 2020, : 97 - 105
  • [8] A Novel Wrapper-Based Optimization Algorithm for the Feature Selection and Classification
    Talpur, Noureen
    Abdulkadir, Said Jadid
    Hasan, Mohd Hilmi
    Alhussian, Hitham
    Alwadain, Ayed
    [J]. CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 74 (03) : 5799 - 5820
  • [9] Wrapper-Based Best Feature Selection Approach for Lung Cancer Detection
    Bishnoi, Vidhi
    Goel, Nidhi
    Tayal, Akash
    [J]. ARTIFICIAL INTELLIGENCE AND SUSTAINABLE COMPUTING FOR SMART CITY, AIS2C2 2021, 2021, 1434 : 175 - 186
  • [10] Incremental Wrapper-based Subset Selection with Replacement: an advantageous alternative to sequential forward selection
    Bermejo, Pablo
    Gamez, Jose A.
    Puerta, Jose M.
    [J]. 2009 IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DATA MINING, 2009, : 367 - 374