Complexity Measures Effectiveness in Feature Selection

Cited by: 6
Authors
Okimoto, Lucas Chesini [1 ]
Savii, Ricardo Manhaes [1 ]
Lorena, Ana Carolina [1 ]
Affiliation
[1] UNIFESP ICT, Sao Jose Dos Campos, Brazil
Funding
São Paulo Research Foundation (FAPESP), Brazil;
Keywords
Complexity Measures; Feature Selection; Classification;
DOI
10.1109/BRACIS.2017.66
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Feature selection is an important pre-processing step usually mandatory in data analysis by Machine Learning techniques. Its objective is to reduce data dimensionality by removing irrelevant and redundant features from a dataset. In this work we investigate how the presence of irrelevant features in a dataset affects the complexity of a classification problem solution. This is performed by monitoring the values of some complexity measures extracted from the original and preprocessed datasets. These descriptors allow estimating the intrinsic difficulty of a classification problem. Some of these measures are then used in feature ranking. The results are promising and reveal that the complexity measures are indeed suitable for estimating feature importance in classification datasets.
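The abstract does not name the complexity measures used for ranking. As a rough illustration only, the sketch below assumes a per-feature Fisher discriminant ratio, a commonly used data complexity measure, and ranks features of a two-class dataset by it. The function names and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np


def fisher_ratio_per_feature(X, y):
    """Per-feature Fisher discriminant ratio for a two-class problem.

    Computes (mu_a - mu_b)^2 / (var_a + var_b) for each column of X.
    Larger values suggest the feature separates the classes better.
    Assumption: this measure is chosen for illustration only.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    if classes.size != 2:
        raise ValueError("this sketch handles exactly two classes")
    a = X[y == classes[0]]
    b = X[y == classes[1]]
    num = (a.mean(axis=0) - b.mean(axis=0)) ** 2
    den = a.var(axis=0) + b.var(axis=0)
    # Guard against division by zero for constant features.
    return num / np.where(den > 0, den, np.finfo(float).eps)


def rank_features(X, y):
    """Return feature indices sorted from most to least discriminative."""
    scores = fisher_ratio_per_feature(X, y)
    order = np.argsort(-scores, kind="stable")
    return order, scores


# Synthetic data (hypothetical): column 1 is informative, the rest are noise.
rng = np.random.default_rng(0)
n = 200
y = np.repeat([0, 1], n // 2)
informative = rng.normal(loc=3.0 * y, scale=1.0)
noise = rng.normal(size=(n, 3))
X = np.column_stack([noise[:, 0], informative, noise[:, 1:]])

order, scores = rank_features(X, y)
print("ranking (best first):", order)
```

In this toy setting the irrelevant columns receive scores near zero, which mirrors the abstract's point that irrelevant features contribute little to reducing problem complexity.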
Pages: 91-96
Number of pages: 6
Related papers
50 in total
  • [1] Data complexity measures in feature selection
    Okimoto, Lucas C.
    Lorena, Ana C.
    [J]. 2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019,
  • [2] Using Data Complexity Measures for Thresholding in Feature Selection Rankers
    Seijo-Pardo, Borja
    Bolon-Canedo, Veronica
    Alonso-Betanzos, Amparo
    [J]. ADVANCES IN ARTIFICIAL INTELLIGENCE, CAEPIA 2016, 2016, 9868 : 121 - 131
  • [3] Feature selection for domain adaptation using complexity measures and swarm intelligence
    Castillo-Garcia, G.
    Moran-Fernandez, L.
    Bolon-Canedo, V.
    [J]. NEUROCOMPUTING, 2023, 548
  • [4] Feature selection based on sparse representation with the measures of classification error rate and complexity of boundary
    Deng, Yanli
    Jin, Weidong
    [J]. OPTIK, 2015, 126 (20) : 2634 - 2639
  • [5] Centralized vs. distributed feature selection methods based on data complexity measures
    Moran-Fernandez, L.
    Bolon-Canedo, V.
    Alonso-Betanzos, A.
    [J]. KNOWLEDGE-BASED SYSTEMS, 2017, 117 : 27 - 45
  • [6] Consistency measures for feature selection
    Antonio Arauzo-Azofra
    Jose Manuel Benitez
    Juan Luis Castro
    [J]. Journal of Intelligent Information Systems, 2008, 30 : 273 - 292
  • [7] Consistency measures for feature selection
    Arauzo-Azofra, Antonio
    Manuel Benitez, Jose
    Luis Castro, Juan
    [J]. JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2008, 30 (03) : 273 - 292
  • [8] The Complexity of Feature Selection for Consistent Biclustering
    Kundakcioglu, O. Erhun
    Pardalos, Panos M.
    [J]. CLUSTER CHALLENGES IN BIOLOGICAL NETWORKS, 2009, : 257 - 266
  • [9] Feature Selection Under a Complexity Constraint
    Plasberg, Jan H.
    Kleijn, W. Bastiaan
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2009, 11 (03) : 565 - 571
  • [10] Revisiting Feature Selection with Data Complexity
    Ngan Thi Dong
    Khosla, Megha
    [J]. 2020 IEEE 20TH INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOENGINEERING (BIBE 2020), 2020, : 211 - 216