Discretization-Based Feature Selection as a Bilevel Optimization Problem

Cited by: 5
Authors
Said, Rihab [1]
Elarbi, Maha [1]
Bechikh, Slim [1]
Coello Coello, Carlos Artemio [2, 3, 4]
Said, Lamjed Ben [1]
Affiliations
[1] Univ Tunis, Strategies Modeling & Artificial Intelligence Lab, ISG, Tunis 2000, Tunisia
[2] CINVESTAV IPN, Dept Comp Sci, Evolutionary Computat Grp, Mexico City 07300, Mexico
[3] Basque Ctr Appl Math, Bilbao 48009, Spain
[4] Ikerbasque, Bilbao 48009, Spain
Keywords
Bilevel optimization; co-evolutionary algorithm; cut-points search; discretization-based feature selection (DBFS); features interactions; SEARCH
DOI
10.1109/TEVC.2022.3192113
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Discretization-based feature selection (DBFS) approaches have shown interesting results with several metaheuristic algorithms, such as particle swarm optimization (PSO), genetic algorithms (GAs), and ant colony optimization (ACO). However, these methods share a shortcoming: they encode a candidate solution as a sequence of cut-points, and the decision to delete or select each feature is induced from this cut-points vector. Because the number of generated cut-points varies from one feature to another, the more cut-points a feature has, the higher its probability of being selected, and vice versa. As a result, possibly important features with one or only a few cut-points, such as the infection rate, the glycemia level, and the blood pressure, may be deleted. To remove this dependency between the selection (or removal) of a feature and the number of its generated potential cut-points, we propose to model the DBFS task as a bilevel optimization problem and then solve it using an improved version of an existing co-evolutionary algorithm, named I-CEMBA. I-CEMBA varies the number of features during the migration process in order to deal with multimodality. The resulting algorithm, termed bilevel discretization-based feature selection (Bi-DFS), performs feature selection at the upper level and discretization at the lower level. Experimental results on several high-dimensional datasets show that Bi-DFS outperforms relevant state-of-the-art methods in terms of classification accuracy, generalization ability, and feature selection bias.
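
The nesting described in the abstract (selection at the upper level, discretization at the lower level) can be illustrated with a minimal Python sketch. This is not I-CEMBA or Bi-DFS: exhaustive enumeration stands in for the co-evolutionary search, and the toy data, the candidate cut-points in `CUT_POINTS`, the lookup-table classifier, and all function names are hypothetical choices made only for illustration.

```python
from itertools import product

# Hypothetical toy data: rows are samples, columns are features, labels are binary.
X = [
    [0.1, 5.0, 0.3],
    [0.2, 1.0, 0.9],
    [0.8, 4.0, 0.4],
    [0.9, 2.0, 0.8],
]
y = [0, 0, 1, 1]

# Hypothetical candidate cut-points per feature; feature 0 has only one.
CUT_POINTS = [[0.5], [1.5, 3.0, 4.5], [0.35, 0.6, 0.85]]


def discretize(row, mask, cuts):
    """Binarize each selected feature at its chosen cut-point."""
    return tuple(int(row[j] > cuts[j]) for j in range(len(row)) if mask[j])


def accuracy(mask, cuts):
    """Training accuracy of a majority-vote lookup table (an assumed stand-in classifier)."""
    votes = {}
    for row, label in zip(X, y):
        votes.setdefault(discretize(row, mask, cuts), []).append(label)
    correct = sum(max(labels.count(0), labels.count(1)) for labels in votes.values())
    return correct / len(y)


def lower_level(mask):
    """Lower level: for a fixed feature mask, pick one cut-point per selected feature."""
    choices = [CUT_POINTS[j] if mask[j] else [None] for j in range(len(mask))]
    best = max(product(*choices), key=lambda cuts: accuracy(mask, cuts))
    return best, accuracy(mask, best)


def upper_level():
    """Upper level: choose the feature mask, scoring each by its lower-level optimum."""
    best = None
    for mask in product([0, 1], repeat=len(CUT_POINTS)):
        if not any(mask):
            continue  # skip the empty feature subset
        cuts, acc = lower_level(mask)
        score = (acc, -sum(mask))  # higher accuracy first, then fewer features
        if best is None or score > best[0]:
            best = (score, mask, cuts, acc)
    return best[1], best[2], best[3]


if __name__ == "__main__":
    print(upper_level())
```

On this toy data the sketch keeps only feature 0, which has a single candidate cut-point: because the mask is decided at the upper level, a feature's chance of being kept does not depend on how many cut-points it offers.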
Pages: 893-907
Page count: 15