Stability of feature selection algorithms: a study on high-dimensional spaces

Cited by: 381
Authors
Kalousis, Alexandros [1]
Prados, Julien [1]
Hilario, Melanie [1]
Affiliation
[1] Univ Geneva, Dept Comp Sci, CH-1211 Geneva 4, Switzerland
Keywords
feature selection; high dimensionality; feature stability
DOI
10.1007/s10115-006-0040-8
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the proliferation of extremely high-dimensional data, feature selection algorithms have become indispensable components of the learning process. Strangely, despite extensive work on the stability of learning algorithms, the stability of feature selection algorithms has been relatively neglected. This study is an attempt to fill that gap by quantifying the sensitivity of feature selection algorithms to variations in the training set. We assess the stability of feature selection algorithms based on the stability of the feature preferences that they express in the form of weights-scores, ranks, or a selected feature subset. We examine a number of measures to quantify the stability of feature preferences and propose an empirical way to estimate them. We perform a series of experiments with several feature selection algorithms on a set of proteomics datasets. The experiments allow us to explore the merits of each stability measure and create stability profiles of the feature selection algorithms. Finally, we show how stability profiles can support the choice of a feature selection algorithm.
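As an illustration of the empirical estimate the abstract describes, the following minimal sketch scores the stability of selected feature subsets under variations of the training set. It is not the authors' implementation. The similarity measure (Jaccard/Tanimoto overlap), the random subsampling scheme, and the toy selector `select_top_k` are assumptions made for illustration only; the abstract itself does not specify any of them.

```python
import random
from itertools import combinations


def jaccard(a, b):
    """Jaccard (Tanimoto) similarity between two feature subsets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def subset_stability(subsets):
    """Average pairwise similarity over all pairs of selected subsets."""
    pairs = list(combinations(subsets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def _mean(values):
    """Mean of a list, 0.0 if the list is empty."""
    return sum(values) / len(values) if values else 0.0


def select_top_k(X, y, k):
    """Hypothetical selector: keep the k features whose class means
    (labels 0 and 1) differ most in absolute value."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(abs(_mean(pos) - _mean(neg)))
    return sorted(range(n_features), key=lambda j: -scores[j])[:k]


def estimate_stability(X, y, k, n_resamples=10, frac=0.9, seed=0):
    """Run the selector on random subsamples of the training set
    (an assumed resampling scheme) and score how well the resulting
    feature subsets agree with each other."""
    rng = random.Random(seed)
    n = len(X)
    subsets = []
    for _ in range(n_resamples):
        idx = rng.sample(range(n), int(frac * n))
        subsets.append(
            select_top_k([X[i] for i in idx], [y[i] for i in idx], k)
        )
    return subset_stability(subsets)
```

A score near 1 means the selector picks nearly the same features regardless of which training examples it sees; a score near 0 means its choices are driven largely by the particular sample. Analogous estimates for weights or ranks would replace `jaccard` with a correlation between weight or rank vectors.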
Pages: 95-116
Page count: 22
Related papers
50 in total
  • [1] Stability of feature selection algorithms: a study on high-dimensional spaces
    Alexandros Kalousis
    Julien Prados
    Melanie Hilario
    [J]. Knowledge and Information Systems, 2007, 12 : 95 - 116
  • [2] Feature Selection for High-Dimensional Data: The Issue of Stability
    Pes, Barbara
    [J]. 2017 IEEE 26TH INTERNATIONAL CONFERENCE ON ENABLING TECHNOLOGIES - INFRASTRUCTURE FOR COLLABORATIVE ENTERPRISES (WETICE), 2017, : 170 - 175
  • [3] Analytical and Experimental Study of Filter Feature Selection Algorithms for High-dimensional Datasets
    Pino, Adrian
    Morell, Carlos
    [J]. PROCEEDINGS OF THE FOURTH INTERNATIONAL WORKSHOP ON KNOWLEDGE DISCOVERY, KNOWLEDGE MANAGEMENT AND DECISION SUPPORT (EUREKA-2013), 2013, 51 : 339 - 349
  • [4] SUBMODULAR FEATURE SELECTION FOR HIGH-DIMENSIONAL ACOUSTIC SCORE SPACES
    Liu, Yuzong
    Wei, Kai
    Kirchhoff, Katrin
    Song, Yisong
    Bilmes, Jeff
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7184 - 7188
  • [5] Tournament screening cum EBIC for feature selection with high-dimensional feature spaces
    Chen, ZeHua
    Chen, JiaHua
    [J]. Science China Mathematics, 2009, (06) : 1327 - 1341
  • [6] Tournament screening cum EBIC for feature selection with high-dimensional feature spaces
    Chen, ZeHua
    Chen, JiaHua
    [J]. Science in China(Series A:Mathematics), 2009, 52 (06) : 1327 - 1341
  • [7] Tournament screening cum EBIC for feature selection with high-dimensional feature spaces
    ZeHua Chen
    JiaHua Chen
    [J]. Science in China Series A: Mathematics, 2009, 52 : 1327 - 1341
  • [8] Tournament screening cum EBIC for feature selection with high-dimensional feature spaces
    Chen Zehua
    Chen JiaHua
    [J]. SCIENCE IN CHINA SERIES A-MATHEMATICS, 2009, 52 (06): : 1327 - 1341
  • [9] Feature selection for high-dimensional data
    Destrero A.
    Mosci S.
    De Mol C.
    Verri A.
    Odone F.
    [J]. Computational Management Science, 2009, 6 (1) : 25 - 40
  • [10] Feature selection for high-dimensional data
    Bolón-Canedo V.
    Sánchez-Maroño N.
    Alonso-Betanzos A.
    [J]. Progress in Artificial Intelligence, 2016, 5 (2) : 65 - 75