Compressed hierarchical mining of frequent closed patterns from dense data sets

Cited by: 5
Authors
Ji, Liping [1]
Tan, Kian-Lee [1]
Tung, Anthony K. H. [1]
Affiliations
[1] Natl Univ Singapore, Dept Comp Sci, Singapore 117543, Singapore
Keywords
frequent closed patterns; progressive; dense data sets; data mining; parallel mining
DOI
10.1109/TKDE.2007.1047
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper addresses the problem of finding frequent closed patterns (FCPs) from very dense data sets. We introduce two compressed hierarchical FCP mining algorithms: C-Miner and B-Miner. Both algorithms compress the original mining space, hierarchically partition the whole mining task into independent subtasks, and mine each subtask progressively. They differ in their task partitioning strategies: C-Miner partitions the mining task based on Compact Matrix Division, whereas B-Miner partitions the task based on Base Rows Projection. The compressed hierarchical mining algorithms enhance mining efficiency and facilitate progressive refinement of results. Moreover, because the subtasks can be mined independently, C-Miner and B-Miner can be readily parallelized without incurring significant communication overhead. We have implemented C-Miner and B-Miner, and our performance study on synthetic data sets and real dense microarray data sets shows their effectiveness over existing schemes. We also report experimental results on parallel versions of these two methods.
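For readers unfamiliar with the term, the sketch below illustrates what a frequent closed pattern is. It is a brute-force enumeration written only for illustration and is not C-Miner, B-Miner, or any method from the paper; the function name `frequent_closed_patterns`, the toy `rows`, and the support threshold are all assumptions introduced here. It relies on the standard fact that an itemset is closed exactly when it equals the intersection of all rows that contain it.

```python
from itertools import combinations


def frequent_closed_patterns(rows, min_support):
    """Brute-force enumeration of frequent closed patterns (illustrative only).

    rows: iterable of item collections, one per row/transaction.
    min_support: minimum number of rows that must contain a pattern.
    Returns a dict mapping each frequent closed pattern (frozenset) to its support.
    """
    rows = [frozenset(r) for r in rows]
    n = len(rows)
    closed = {}
    # The intersection of any group of rows is a closed pattern, and every
    # closed pattern arises this way, so intersecting all groups of at least
    # min_support rows finds every frequent closed pattern.
    for k in range(max(1, min_support), n + 1):
        for group in combinations(range(n), k):
            pattern = frozenset.intersection(*(rows[i] for i in group))
            if not pattern:
                continue  # ignore the empty pattern
            # Support is the number of rows that contain the pattern.
            closed[pattern] = sum(1 for r in rows if pattern <= r)
    return closed


# Hypothetical toy data set: four rows over the items a, b, c, d.
rows = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}, {"c", "d"}]
result = frequent_closed_patterns(rows, min_support=2)
```

On this toy input, `{a}` appears in three rows but is not closed, because its superset `{a, b}` appears in the same three rows; only `{a, b}`, `{c}`, and `{d}` are returned. The enumeration is exponential in the number of rows, which is the kind of cost that motivates partitioned approaches such as those described in the abstract.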
Pages: 1175-1187
Number of pages: 13
Related papers
50 in total
  • [1] HDminer: Efficient Mining of High Dimensional Frequent Closed Patterns from Dense Data
    Xu, Jianpeng
    Ji, Shufan
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOP (ICDMW), 2014, : 1061 - 1067
  • [2] Mining frequent closed patterns in microarray data
    Cong, G
    Tan, KL
    Tung, AKH
    Pan, F
    [J]. FOURTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2004, : 363 - 366
  • [3] An Integrated Framework for Relational and Hierarchical Mining of Frequent Closed Patterns
    Kumar, B. Pravin
    Divakar, V.
    Vinoth, E.
    Senthilkumar, Radha
    [J]. CONTEMPORARY COMPUTING, PROCEEDINGS, 2009, 40 : 115 - 126
  • [4] Mining frequent closed patterns with item constraints in data streams
    Hu, Wei-Cheng
    Wang, Ben-Nian
    Cheng, Zhuan-Liu
    [J]. PROCEEDINGS OF 2008 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2008, : 274 - 280
  • [5] Efficient algorithms for mining frequent and closed patterns from semi-structured data
    Arimura, Hiroki
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2008, 5012 : 2 - +
  • [6] Mining Top-K Frequent Closed Patterns from Gene Expression Data
    Ji, Shufan
    Wang, Xuejiao
    Zong, Yi
    Gao, Xiaopeng
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOP (ICDMW), 2014, : 732 - 739
  • [7] Effective algorithm for mining compressed frequent patterns
    School of Software, Beijing University of Aeronautics and Astronautics, Beijing 100191, China
    [Unknown]
    [J]. Beijing Hangkong Hangtian Daxue Xuebao, 2009, 5 (640-643):
  • [8] A Compact Data Structure Based Technique for Mining Frequent Closed Item Sets
    Ahuja, Kamlesh
    Mishra, Durgesh Kumar
    Jain, Sarika
    [J]. SMART TRENDS IN INFORMATION TECHNOLOGY AND COMPUTER COMMUNICATIONS, SMARTCOM 2016, 2016, 628 : 503 - 508
  • [9] Top-down Mining Frequent Closed Patterns in Microarray Data
    Shi JianJun
    Miao YuQing
    Zhang WanZhen
    [J]. THIRD INTERNATIONAL CONFERENCE ON GENETIC AND EVOLUTIONARY COMPUTING, 2009, : 851 - 854
  • [10] Mining frequent patterns from XML data
    Win, Chit Nilar
    Hla, Khin Haymar Saw
    [J]. APSITT 2005: 6th Asia-Pacific Symposium on Information and Telecommunication Technologies, Proceedings, 2005, : 208 - 212