An evaluation of feature selection methods for environmental data

Cited by: 75
Authors
Effrosynidis, Dimitrios [1]
Arampatzis, Avi [1]
Affiliations
[1] Democritus Univ Thrace, Dept Elect & Comp Engn, Database & Informat Retrieval Res Unit, Xanthi 67100, Greece
Keywords
Feature selection; Ensemble; Machine learning; Classification; Environmental data; IDENTIFICATION; CLASSIFICATION; FUSION; FILTER
DOI
10.1016/j.ecoinf.2021.101224
Chinese Library Classification number
Q14 [Ecology (Bioecology)]
Discipline classification code
071012; 0713
Abstract
We present a comprehensive experimental study of 12 individual and 6 ensemble feature selection methods for classification tasks on environmental data, specifically in the species distribution modeling domain. The individual methods span all 3 categories, i.e. filter, wrapper, and embedded feature selection. Experiments on 8 environmental datasets show that Shapley Additive Explanations (SHAP) and Permutation Importance are the most effective individual methods, both from the wrapper category. Generally, filter methods perform poorly, and embedded methods fall in between. Of the 2 machine learning algorithms used, Random Forest and LightGBM, the latter prevailed. Of the 6 ensemble methods considered, i.e. Borda Count, Condorcet, Coombs, Bucklin, Instant Runoff, and Reciprocal Ranking, the last performs best: it outperforms every other method, individual or ensemble, and is highly stable.
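To illustrate how rank-based ensembles such as those named in the abstract combine several feature rankings, here is a minimal Python sketch of Reciprocal Ranking and Borda Count. It assumes one common formulation of each (a feature's fused score is the sum of 1/rank across rankings for Reciprocal Ranking, and the sum of "number of features ranked below it" for Borda Count); the paper's exact definitions may differ. The function names, the alphabetical tie-break, and the example feature names are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def reciprocal_ranking(rankings):
    """Fuse ranked lists; assumed score(f) = sum over rankings of 1 / rank(f)."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the best feature in a list has rank 1.
        for rank, feature in enumerate(ranking, start=1):
            scores[feature] += 1.0 / rank
    # Highest fused score first; ties broken alphabetically (an assumption).
    return sorted(scores, key=lambda f: (-scores[f], f))


def borda_count(rankings):
    """Fuse ranked lists; each list awards a feature (n - rank) points."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for rank, feature in enumerate(ranking, start=1):
            scores[feature] += n - rank
    return sorted(scores, key=lambda f: (-scores[f], f))


# Hypothetical rankings from three individual selection methods, best first.
rankings = [
    ["elevation", "temperature", "rainfall", "slope"],
    ["temperature", "elevation", "slope", "rainfall"],
    ["elevation", "rainfall", "temperature", "slope"],
]

fused_rr = reciprocal_ranking(rankings)
fused_borda = borda_count(rankings)
print(fused_rr)
print(fused_borda)
```

Because the reciprocal weights 1, 1/2, 1/3, ... fall off quickly, Reciprocal Ranking rewards features that some method places at or near the top, whereas Borda Count weights every position change equally.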
Pages: 10