A data mining proxy approach for efficient frequent itemset mining

Cited by: 2

Authors:

Yu, Jeffrey Xu ^{[1]}

Li, Zhiheng ^{[1]}

Liu, Guimei ^{[2]}

Affiliations:

[1] Chinese Univ Hong Kong, Hong Kong, Hong Kong, Peoples R China

[2] Natl Univ Singapore, Singapore 117548, Singapore

Source:

VLDB JOURNAL | 2008 / Volume 17 / Issue 04

Keywords:

Data Mining; Association Rule; Frequent Pattern; Minimum Support; Frequent Itemset;

DOI:

10.1007/s00778-007-0047-0

Chinese Library Classification number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Data mining has attracted considerable research effort over the past decade. However, little work has been reported on efficiently supporting a large number of users who issue different data mining queries periodically, as new needs arise and as data is updated. Our work is motivated by the fact that the pattern-growth method, which constructs an initial tree and mines frequent patterns on top of it, is one of the most efficient methods for frequent pattern mining. In this paper, we present a data mining proxy approach that reduces the I/O cost of constructing an initial tree by utilizing trees that are already resident in memory. The tree we construct is the smallest for a given data mining query. In addition, our proxy approach can also reduce the CPU cost of mining patterns, because the cost of mining depends on the sizes of the trees. The focus of the work is on constructing an initial tree efficiently. We propose three tree operations to construct a tree. With a unique coding scheme, we can efficiently project subtrees from on-disk trees or in-memory trees. Our performance study indicates that the data mining proxy significantly reduces the I/O cost of constructing trees and the CPU cost of mining patterns over the constructed trees.
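
As a rough illustration of the pattern-growth idea the abstract builds on, the following sketch constructs a generic prefix tree (in the style of an FP-tree) from a toy transaction list and shows that a higher minimum support yields a smaller initial tree. It is not the paper's proxy, its three tree operations, or its coding scheme, none of which are specified in this record; the names `Node`, `build_fp_tree`, `tree_size`, and the sample transactions are hypothetical and chosen only for illustration.

```python
from collections import Counter


class Node:
    """One node of a generic prefix tree over frequent items (hypothetical helper)."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build a generic FP-tree-style prefix tree; not the paper's construction."""
    # First pass: count in how many transactions each item occurs.
    support = Counter(item for t in transactions for item in set(t))
    frequent = {item for item, c in support.items() if c >= min_support}

    root = Node(None)
    # Second pass: insert each transaction's frequent items, ordered by
    # descending support (ties broken alphabetically), as a path from the root.
    for t in transactions:
        items = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-support[i], i),
        )
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root


def tree_size(node):
    """Number of nodes below `node` (the root itself is not counted)."""
    return sum(1 + tree_size(child) for child in node.children.values())


# Toy data, assumed for illustration only.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "d"],
    ["a", "b", "c", "d"],
]

small = build_fp_tree(transactions, min_support=4)
large = build_fp_tree(transactions, min_support=2)
print(tree_size(small), tree_size(large))
```

On this toy input the tree for the stricter query (minimum support 4) has fewer nodes than the tree for minimum support 2, which is the sense in which mining cost can depend on the size of the initial tree.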

Pages: 947-970

Page count: 24
