Data complexity measures in feature selection

Cited by: 0
Authors
Okimoto, Lucas C. [1]
Lorena, Ana C. [2]
Affiliations
[1] Univ Fed Sao Paulo UNIFESP, Inst Ciencia & Tecnol ICT, Sao Jose Dos Campos, SP, Brazil
[2] Inst Tecnol Aeronaut ITA, Div Ciencia Comp IEC, Sao Jose Dos Campos, SP, Brazil
Funding
São Paulo Research Foundation (FAPESP), Brazil
Keywords
Machine Learning; feature selection; data complexity; efficient feature selection
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Feature selection (FS) is a pre-processing step that is often mandatory in data analysis with Machine Learning techniques. Its objective is to reduce data dimensionality by identifying and retaining only the relevant features of a dataset. In this work we evaluate the use of complexity measures of classification problems in FS. These descriptors estimate the intrinsic difficulty of a classification problem from characteristics of the dataset available for learning. We propose a combined univariate-multivariate FS technique that employs two complexity measures: Fisher's maximum discriminant ratio and the sum of intra-extra class distances. The results indicate that the complexity measures are suitable for estimating feature importance in classification datasets. Large reductions in the number of features were obtained while, in general, preserving the predictive accuracy of two strong classification techniques: Support Vector Machines and Random Forests.
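To make the univariate half of the idea concrete, the sketch below scores each feature with a two-class Fisher discriminant ratio, (mu_a - mu_b)^2 / (var_a + var_b), and ranks features by that score. This is an illustration only: it is not the authors' combined univariate-multivariate technique, it does not cover the intra-extra class distance measure, and the function names `fisher_ratio_per_feature` and `rank_features` are hypothetical.

```python
import numpy as np

def fisher_ratio_per_feature(X, y):
    """Two-class Fisher discriminant ratio for every column of X (illustrative)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    if classes.size != 2:
        raise ValueError("this sketch handles exactly two classes")
    a, b = X[y == classes[0]], X[y == classes[1]]
    num = (a.mean(axis=0) - b.mean(axis=0)) ** 2   # squared gap between class means
    den = a.var(axis=0) + b.var(axis=0)            # summed within-class variances
    # Zero variance in both classes: infinite score if the means differ, else zero.
    scores = np.where(num > 0, np.inf, 0.0)
    np.divide(num, den, out=scores, where=den > 0)
    return scores

def rank_features(X, y):
    """Return feature indices ordered from most to least discriminative, plus scores."""
    scores = fisher_ratio_per_feature(X, y)
    order = np.argsort(scores)[::-1]
    return order, scores

if __name__ == "__main__":
    # Toy data (made up): feature 0 separates the classes, feature 1 does not.
    X = [[0, 5], [1, 6], [10, 5], [11, 6]]
    y = [0, 0, 1, 1]
    order, scores = rank_features(X, y)
    print(order, scores)
```

A higher ratio suggests the feature separates the classes better when considered alone; a selection step could then keep the top-ranked features, with the cutoff left as a design choice.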
Pages: 8