Core-generating approximate minimum entropy discretization for rough set feature selection in pattern classification

Cited by: 39
Authors
Tian, David [1, 2]
Zeng, Xiao-jun [2]
Keane, John [2]
Affiliations
[1] Sheffield Hallam Univ, Dept Comp, Fac ACES, Sheffield S1 1WB, S Yorkshire, England
[2] Univ Manchester, Sch Comp Sci, Manchester M13 9PL, Lancs, England
Keywords
Core-generating approximate minimum entropy discretization; Rough set feature selection; Pattern classification; Constraint satisfaction optimization problems; REDUCTION;
DOI
10.1016/j.ijar.2011.03.001
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Rough set feature selection (RSFS) can be used to improve classifier performance. RSFS removes redundant attributes whilst retaining important ones that preserve the classification power of the original dataset. Reducts are feature subsets selected by RSFS. The core is the intersection of all the reducts of a dataset. RSFS can only handle discrete attributes; hence, continuous attributes need to be discretized before being input to RSFS. Discretization determines the core size of a discrete dataset. However, current discretization methods do not consider the core size during discretization. Earlier work proposed the core-generating approximate minimum entropy discretization (C-GAME) algorithm, which selects the maximum number of minimum entropy cuts capable of generating a non-empty core within a discrete dataset. The contributions of this paper are as follows: (1) the C-GAME algorithm is improved by adding a new type of constraint to eliminate the possibility that only a single reduct is present in a C-GAME-discrete dataset; (2) performance evaluation of C-GAME in comparison to C4.5, multi-layer perceptrons, RBF networks and k-nearest neighbours classifiers on ten datasets chosen from the UCI Machine Learning Repository; (3) performance evaluation of C-GAME in comparison to Recursive Minimum Entropy Partition (RMEP), ChiMerge, Boolean Reasoning and Equal Frequency discretization algorithms on the ten datasets; (4) evaluation of the effects of C-GAME and the other four discretization methods on the sizes of reducts; (5) an upper bound is defined on the total number of reducts within a dataset; (6) the effects of different discretization algorithms on the total number of reducts are analysed; and (7) performance analysis of two RSFS algorithms (a genetic algorithm and Johnson's algorithm). © 2011 Elsevier Inc. All rights reserved.
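The definitions of reducts and the core in the abstract can be illustrated with a small sketch. It is not the C-GAME algorithm and not either of the RSFS algorithms evaluated in the paper; it is a brute-force enumeration that assumes an already discrete, consistent decision table, where a reduct is taken to be a minimal attribute subset under which objects with equal attribute values always share the same decision. The function names and the toy table `TABLE` are hypothetical.

```python
from itertools import combinations

# Toy discrete decision table (an assumption for illustration):
# each row is (a0, a1, a2, decision). Columns a1 and a2 carry the same
# information, and the decision depends on a0 together with a1.
TABLE = [
    (0, 0, 0, "n"),
    (0, 1, 1, "y"),
    (1, 0, 0, "y"),
    (1, 1, 1, "n"),
]

def is_consistent(rows, attrs):
    """True if rows that agree on `attrs` always have the same decision."""
    seen = {}
    for *values, decision in rows:
        key = tuple(values[a] for a in attrs)
        if seen.setdefault(key, decision) != decision:
            return False
    return True

def reducts(rows, n_attrs):
    """Enumerate all minimal consistent attribute subsets by brute force."""
    found = []
    for size in range(1, n_attrs + 1):
        for subset in combinations(range(n_attrs), size):
            # Skip supersets of an already found reduct: they are not minimal.
            if any(set(r) <= set(subset) for r in found):
                continue
            if is_consistent(rows, subset):
                found.append(subset)
    return found

def core(all_reducts):
    """The core is the intersection of all reducts."""
    if not all_reducts:
        return set()
    return set.intersection(*(set(r) for r in all_reducts))

all_reducts = reducts(TABLE, 3)
print(all_reducts)        # [(0, 1), (0, 2)]
print(core(all_reducts))  # {0}
```

In this toy table there are two reducts and attribute 0 belongs to both, so the core is non-empty. The enumeration is exponential in the number of attributes, so it is only suitable for very small tables.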
Pages: 863-880
Number of pages: 18