A Study on Feature Subset Selection Using Rough Set Theory

Cited by: 3
Authors
Han, Jianchao [1]
Affiliation
[1] Calif State Univ Dominguez Hills, Dept Comp Sci, 1000 E Victoria St, Carson, CA 90747 USA
Keywords
Feature Selection; Data Reduction; Rough Set Theory; Information Entropy; Classification;
DOI
10.1166/jama.2012.1018
Chinese Library Classification number
O1 [Mathematics]
Subject classification code
0701; 070101
Abstract
Feature subset selection is an important component of knowledge discovery and data mining systems that helps reduce data dimensionality. Rough set theory provides a mechanism for selecting feature subsets. Most existing rough set-based feature selection algorithms suffer from the intensive computation of either discernibility functions or positive regions needed to find attribute reducts. To improve efficiency, we propose a new concept, called relative attribute dependency, with which we present a necessary and sufficient condition for the minimum conditional attribute reduct of a decision table and develop a new computation model to find the minimum reduct of condition attributes. The relative attribute dependency can be calculated by counting the distinct rows of the sub-decision table, instead of generating discernibility functions or positive regions. Thus the computational efficiency of finding minimum reducts is greatly improved. Two novel algorithms for finding optimal reducts of condition attributes based on the relative attribute dependency are proposed and implemented in Java: a brute-force algorithm and a heuristic algorithm that uses attribute entropy as the heuristic function. The proposed algorithms are evaluated on 10 data sets from the UCI Machine Learning Repository. We conduct the comparison of data classification using C4.5 with the original [...] their usefulness and are analyzed for further research.
Pages: 239-249
Number of pages: 11
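
The abstract states that relative attribute dependency can be computed by counting distinct rows of a sub-decision table, and that a brute-force search over condition attributes can find a reduct. The Python sketch below illustrates that idea only; it is not the paper's Java implementation. It assumes the dependency is the ratio of distinct rows projected onto a condition subset to distinct rows projected onto that subset plus the decision attribute, and that a subset qualifies when the ratio equals 1. The abstract gives neither formula, so both are assumptions. The function names and the toy table are hypothetical.

```python
from itertools import combinations


def distinct_rows(rows, attrs):
    """Count distinct rows after projecting each row onto the given attributes."""
    return len({tuple(row[a] for a in attrs) for row in rows})


def relative_dependency(rows, cond_attrs, decision):
    """Assumed definition: |projection onto P| / |projection onto P + decision|.

    The ratio is 1 exactly when rows that agree on P also agree on the decision.
    """
    cond_attrs = list(cond_attrs)
    return distinct_rows(rows, cond_attrs) / distinct_rows(rows, cond_attrs + [decision])


def brute_force_reduct(rows, cond_attrs, decision):
    """Return a smallest non-empty subset with assumed dependency 1, or None.

    Subsets are tried in order of increasing size, so the first hit is minimal.
    The search is exponential in the number of attributes.
    """
    for size in range(1, len(cond_attrs) + 1):
        for subset in combinations(cond_attrs, size):
            if relative_dependency(rows, subset, decision) == 1.0:
                return list(subset)
    return None


# Hypothetical toy decision table: three condition attributes, one decision.
table = [
    {"outlook": "sunny", "windy": "no", "humid": "high", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "humid": "high", "play": "no"},
    {"outlook": "rain", "windy": "no", "humid": "high", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "humid": "normal", "play": "no"},
    {"outlook": "overcast", "windy": "no", "humid": "normal", "play": "yes"},
]

if __name__ == "__main__":
    print(relative_dependency(table, ["outlook"], "play"))
    print(brute_force_reduct(table, ["outlook", "windy", "humid"], "play"))
```

Counting distinct projected rows needs only a set of tuples, which is what lets this approach avoid building discernibility functions or positive regions. The entropy-guided heuristic mentioned in the abstract is not sketched here because the abstract does not describe it in enough detail.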