A data mining proxy approach for efficient frequent itemset mining

Cited by: 2
Authors
Yu, Jeffrey Xu [1]
Li, Zhiheng [1]
Liu, Guimei [2]
Affiliations
[1] Chinese Univ Hong Kong, Hong Kong, Hong Kong, Peoples R China
[2] Natl Univ Singapore, Singapore 117548, Singapore
Source
VLDB JOURNAL | 2008 / Vol. 17 / Issue 04
Keywords
Data Mining; Association Rule; Frequent Pattern; Minimum Support; Frequent Itemset;
DOI
10.1007/s00778-007-0047-0
Chinese Library Classification
TP3 [Computing Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Data mining has attracted considerable research effort over the past decade. However, little work has been reported on efficiently supporting a large number of users who issue different data mining queries periodically, as new needs arise and as data is updated. Our work is motivated by the fact that the pattern-growth method, which constructs an initial tree and mines frequent patterns on top of it, is one of the most efficient methods for frequent pattern mining. In this paper, we present a data mining proxy approach that reduces the I/O cost of constructing an initial tree by utilizing trees that are already resident in memory. The tree we construct is the smallest for a given data mining query. In addition, our proxy approach can also reduce the CPU cost of mining patterns, because the cost of mining depends on the sizes of the trees. The focus of the work is to construct an initial tree efficiently. We propose three tree operations to construct a tree. With a unique coding scheme, we can efficiently project subtrees from on-disk trees or in-memory trees. Our performance study indicates that the data mining proxy significantly reduces the I/O cost of constructing trees and the CPU cost of mining patterns over the trees constructed.
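The reuse idea in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the three tree operations and the coding scheme are not described in this record, so the code below only shows the general principle that a tree already in memory, built for a lower minimum support, can yield the smaller tree for a stricter query without rescanning the data. All names (`Node`, `insert`, `build_tree`, `project_tree`, `to_dict`) are hypothetical.

```python
from collections import Counter


class Node:
    """One node of a prefix tree over frequent items (hypothetical layout)."""

    def __init__(self, item):
        self.item = item
        self.count = 0
        self.children = {}


def insert(root, items, count):
    """Add `count` transactions that share the ordered item path `items`."""
    node = root
    for item in items:
        if item not in node.children:
            node.children[item] = Node(item)
        node = node.children[item]
        node.count += count


def build_tree(transactions, min_support):
    """Build a tree by scanning the raw data (the I/O-heavy step)."""
    support = Counter(item for t in transactions for item in set(t))
    frequent = {i for i, c in support.items() if c >= min_support}
    root = Node(None)
    for t in transactions:
        # Order items by descending support, breaking ties by item name.
        path = sorted(set(t) & frequent, key=lambda i: (-support[i], i))
        insert(root, path, 1)
    return root, support


def project_tree(root, support, min_support):
    """Derive the tree for a higher min_support from an in-memory tree.

    Assumes min_support is at least the threshold `root` was built with,
    so every item that is frequent now is already present in `root`.
    """
    frequent = {i for i, c in support.items() if c >= min_support}
    new_root = Node(None)

    def walk(node, prefix):
        for child in node.children.values():
            path = prefix + [child.item] if child.item in frequent else prefix
            # Transactions whose (filtered) item list ends exactly at `child`.
            ends_here = child.count - sum(g.count for g in child.children.values())
            if ends_here and path:
                insert(new_root, path, ends_here)
            walk(child, path)

    walk(root, [])
    return new_root


def to_dict(node):
    """Serialize a tree as nested dicts so two trees can be compared."""
    return {c.item: (c.count, to_dict(c)) for c in node.children.values()}


transactions = [["a", "b", "c"], ["a", "b"], ["a", "c", "d"], ["b", "c"], ["a"]]
resident, support = build_tree(transactions, min_support=1)  # already in memory
smaller = project_tree(resident, support, min_support=3)     # no data rescan
```

In this sketch, projecting the resident tree gives the same tree that a fresh scan with the higher threshold would give, which is why the second scan can be skipped. A lower threshold than the resident tree's would still require going back to the data.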
Pages: 947-970
Number of pages: 24
Related papers
50 in total
  • [1] A data mining proxy approach for efficient frequent itemset mining
    Jeffrey Xu Yu
    Zhiheng Li
    Guimei Liu
    The VLDB Journal, 2008, 17 : 947 - 970
  • [2] Data mining proxy: Serving large number of users for efficient frequent itemset mining
    Li, ZH
    Yu, JX
    Lu, HJ
    Xu, YB
    Liu, GM
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2004, 3056 : 458 - 463
  • [3] An efficient algorithm for frequent itemset mining on data streams
    Xie Zhi-Jun
    Chen Hong
    Li, Cuiping
    ADVANCES IN DATA MINING: APPLICATIONS IN MEDICINE, WEB MINING, MARKETING, IMAGE AND SIGNAL MINING, 2006, 4065 : 474 - 491
  • [4] An efficient frequent itemset mining algorithm
    Luo, Ke
    Zhang, Xue-Mao
    PROCEEDINGS OF 2007 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2007, : 756 - 761
  • [5] BISC: A Bitmap Itemset Support Counting Approach for Efficient Frequent Itemset Mining
    Chen, Jinlin
    Xiao, Keli
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2010, 4 (03)
  • [6] Efficient Incremental Itemset Tree for Approximate Frequent Itemset Mining On Data Stream
    Bai, Pavitra S.
    Kumar, Ravi G. K.
    PROCEEDINGS OF THE 2016 2ND INTERNATIONAL CONFERENCE ON APPLIED AND THEORETICAL COMPUTING AND COMMUNICATION TECHNOLOGY (ICATCCT), 2016, : 239 - 242
  • [7] An Efficient Itemset Mining Approach for Data Streams
    Baralis, Elena
    Cerquitelli, Tania
    Chiusano, Silvia
    Grand, Alberto
    Grimaudo, Luigi
    KNOWLEDGE-BASED AND INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT II: 15TH INTERNATIONAL CONFERENCE, KES 2011, 2011, 6882 : 515 - 523
  • [8] Efficient Frequent Itemset Mining from Dense Data Streams
    Cuzzocrea, Alfredo
    Jiang, Fan
    Lee, Wookey
    Leung, Carson K.
    WEB TECHNOLOGIES AND APPLICATIONS, APWEB 2014, 2014, 8709 : 593 - 601
  • [9] An approximate approach to frequent itemset mining
    Zhang, Chunkai
    Zhang, Xudong
    Tian, Panbo
    2017 IEEE SECOND INTERNATIONAL CONFERENCE ON DATA SCIENCE IN CYBERSPACE (DSC), 2017, : 68 - 73
  • [10] Frequent Itemset Mining for Big Data
    Moens, Sandy
    Aksehirli, Emin
    Goethals, Bart
    2013 IEEE INTERNATIONAL CONFERENCE ON BIG DATA, 2013,