Tree structure for efficient data mining using rough sets

Cited by: 61
Authors
Ananthanarayana, VS [1 ]
Murty, MN [1 ]
Subramanian, DK [1 ]
Affiliation
[1] Indian Inst Sci, Dept Comp Sci & Automat, Bangalore 560012, Karnataka, India
Keywords
PC-tree; single database scan; dynamic mining; segment PC-tree; rough PC-tree; classification; rough set;
DOI
10.1016/S0167-8655(02)00197-6
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In data mining, an important goal is to generate an abstraction of the data. Such an abstraction helps in reducing the space and search time requirements of the overall decision making process. Further, it is important that the abstraction is generated from the data with a small number of disk scans. We propose a novel data structure, pattern count tree (PC-tree), that can be built by scanning the database only once. PC-tree is a minimal size complete representation of the data and it can be used to represent dynamic databases with the help of knowledge that is either static or changing. We show that further compactness can be achieved by constructing the PC-tree on segmented patterns. We exploit the flexibility offered by rough sets to realize a rough PC-tree and use it for efficient and effective rough classification. To be consistent with the sizes of the branches of the PC-tree, we use upper and lower approximations of feature sets in a manner different from the conventional rough set theory. We conducted experiments using the proposed classification scheme on a large-scale hand-written digit data set. We use the experimental results to establish the efficacy of the proposed approach. (C) 2002 Elsevier Science B.V. All rights reserved.
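The single-scan construction described in the abstract can be illustrated with a minimal sketch. The abstract does not give the node layout or the insertion procedure, so everything below is an assumption made for illustration: each pattern is treated as a set of items inserted in sorted order into a prefix tree, each node keeps a count of the patterns passing through it, and the names `Node`, `insert_pattern`, `build_tree`, `node_count`, and `prefix_count` are hypothetical. It is not the authors' implementation, and it does not cover the segment PC-tree or the rough PC-tree.

```python
from dataclasses import dataclass, field
from typing import Dict, Hashable, Iterable, Optional


@dataclass
class Node:
    """One tree node: an item and the number of patterns passing through it."""
    item: Optional[Hashable] = None          # None marks the root
    count: int = 0
    children: Dict[Hashable, "Node"] = field(default_factory=dict)


def insert_pattern(root: Node, pattern: Iterable[Hashable]) -> None:
    """Insert one pattern; shared prefixes reuse nodes and bump their counts.

    Assumption: items are comparable, and sorting them gives the canonical
    order that lets patterns with common prefixes share a branch.
    """
    node = root
    for item in sorted(set(pattern)):
        child = node.children.get(item)
        if child is None:
            child = Node(item=item)
            node.children[item] = child
        child.count += 1
        node = child


def build_tree(database: Iterable[Iterable[Hashable]]) -> Node:
    """Build the tree in a single pass: each pattern is read exactly once."""
    root = Node()
    for pattern in database:
        insert_pattern(root, pattern)
    return root


def node_count(root: Node) -> int:
    """Number of non-root nodes, a rough measure of the tree's size."""
    return sum(1 + node_count(child) for child in root.children.values())


def prefix_count(root: Node, prefix: Iterable[Hashable]) -> int:
    """Count of stored patterns whose sorted item list starts with `prefix`.

    Returns 0 for a prefix that is not in the tree; the empty prefix also
    returns 0 because the root itself carries no count in this sketch.
    """
    node = root
    for item in sorted(set(prefix)):
        node = node.children.get(item)
        if node is None:
            return 0
    return node.count
```

Because insertion only touches the branch for the incoming pattern, the same `insert_pattern` call can be used to add patterns after the initial build, which is one simple reading of how such a tree could follow a growing database.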
Pages: 851-862
Page count: 12