A data mining proxy approach for efficient frequent itemset mining

Cited by: 2

Authors:

Yu, Jeffrey Xu [1]

Li, Zhiheng [1]

Liu, Guimei [2]

Affiliations:

[1] Chinese Univ Hong Kong, Hong Kong, Hong Kong, Peoples R China

[2] Natl Univ Singapore, Singapore 117548, Singapore

Source:

VLDB JOURNAL | 2008 / Volume 17 / Issue 04

Keywords:

Data Mining; Association Rule; Frequent Pattern; Minimum Support; Frequent Itemset;

DOI:

10.1007/s00778-007-0047-0

Chinese Library Classification number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Data mining has attracted a lot of research efforts during the past decade. However, little work has been reported on the efficiency of supporting a large number of users who issue different data mining queries periodically when there are new needs and when data is updated. Our work is motivated by the fact that the pattern-growth method is one of the most efficient methods for frequent pattern mining which constructs an initial tree and mines frequent patterns on top of the tree. In this paper, we present a data mining proxy approach that can reduce the I/O costs to construct an initial tree by utilizing the trees that have already been resident in memory. The tree we construct is the smallest for a given data mining query. In addition, our proxy approach can also reduce CPU cost in mining patterns, because the cost of mining relies on the sizes of trees. The focus of the work is to construct an initial tree efficiently. We propose three tree operations to construct a tree. With a unique coding scheme, we can efficiently project subtrees from on-disk trees or in-memory trees. Our performance study indicated that the data mining proxy significantly reduces the I/O cost to construct trees and CPU cost to mine patterns over the trees constructed.
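To make the abstract's point about tree size concrete, here is a minimal sketch of building an initial prefix tree in the general pattern-growth (FP-tree) style. It is an illustration only and is not the paper's proxy method, its three tree operations, or its coding scheme; the names `Node`, `build_tree`, `tree_size`, and the sample `transactions` are all hypothetical.

```python
from collections import Counter


class Node:
    """One prefix-tree node: an item, its count, and its children by item."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_tree(transactions, min_support):
    """Build a prefix tree over the items that meet min_support (assumed sketch)."""
    # First pass: count in how many transactions each item appears.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}

    root = Node(None)
    # Second pass: insert the frequent items of each transaction,
    # ordered by descending support (ties broken by item name).
    for t in transactions:
        items = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root


def tree_size(node):
    """Count the nodes below the given node (the root itself is not counted)."""
    return sum(1 + tree_size(c) for c in node.children.values())


# Hypothetical sample data, not taken from the paper.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "d"],
    ["a", "b", "c", "e"],
]

low_threshold_tree = build_tree(transactions, min_support=1)
high_threshold_tree = build_tree(transactions, min_support=3)
print(tree_size(low_threshold_tree), tree_size(high_threshold_tree))
```

In this sketch a higher minimum support prunes infrequent items before insertion, so the resulting tree has fewer nodes, which is the sense in which a smaller tree for a given query can lower mining cost.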

Pages: 947-970

Page count: 24

50 items in total

[21] Anytime Frequent Itemset Mining of Transactional Data Streams
Goyal, Poonam
Challa, Jagat Sesh
Shrivastava, Shivin
Goyal, Navneet
BIG DATA RESEARCH, 2020, 21
[22] Novel algorithm for frequent itemset mining in data warehouses
Xu L.-J.
Xie K.-L.
Journal of Zhejiang University-SCIENCE A, 2006, 7 (2) : 216 - 224
[23] Parallel Incremental Frequent Itemset Mining for Large Data
Song, Yu-Geng
Cui, Hui-Min
Feng, Xiao-Bing
JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2017, 32 (02) : 368 - 385
[24] A Survey on Closed Frequent Itemset Mining on Data Streams
Bai, Pavitra S.
Kumar, Ravi G. K.
PROCEEDINGS OF THE 2016 2ND INTERNATIONAL CONFERENCE ON CONTEMPORARY COMPUTING AND INFORMATICS (IC3I), 2016 : 542 - 547
[25] Frequent Itemset Mining in High Dimensional Data: A Review
Zaki, Fatimah Audah Md
Zulkurnain, Nurul Fariza
COMPUTATIONAL SCIENCE AND TECHNOLOGY, 2019, 481 : 325 - 334
[26] A Frequent and Rare Itemset Mining Approach to Transaction Clustering
Tummala, Kuladeep
Oswald, C.
Sivaselvan, B.
DATA SCIENCE ANALYTICS AND APPLICATIONS, DASAA 2017, 2018, 804 : 8 - 18
[27] Approximate Frequent Itemset Mining for Streaming Data on FPGA
Li, Yubin
Sun, Yuliang
Dai, Guohao
Xu, Qiang
Wang, Yu
Yang, Huazhong
2016 26TH INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE LOGIC AND APPLICATIONS (FPL), 2016,
[28] A new approach for mining frequent K-itemset
Sankar, H. Ravi
Naidu, M. M.
WCECS 2007: WORLD CONGRESS ON ENGINEERING AND COMPUTER SCIENCE, 2007 : 718 - +
[29] PARASOL: a hybrid approximation approach for scalable frequent itemset mining in streaming data
Yamamoto, Yoshitaka
Tabei, Yasuo
Iwanuma, Koji
JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2020, 55 (01) : 119 - 147
[30] PARASOL: a hybrid approximation approach for scalable frequent itemset mining in streaming data
Yoshitaka Yamamoto
Yasuo Tabei
Koji Iwanuma
Journal of Intelligent Information Systems, 2020, 55 : 119 - 147
