A data mining proxy approach for efficient frequent itemset mining

Cited by: 2

Authors:

Yu, Jeffrey Xu [1]

Li, Zhiheng [1]

Liu, Guimei [2]

Affiliations:

[1] Chinese Univ Hong Kong, Hong Kong, Hong Kong, Peoples R China

[2] Natl Univ Singapore, Singapore 117548, Singapore

Source:

VLDB JOURNAL | 2008 / Vol. 17 / No. 04

Keywords:

Data Mining; Association Rule; Frequent Pattern; Minimum Support; Frequent Itemset;

DOI:

10.1007/s00778-007-0047-0

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Subject classification code:

0812 ;

Abstract:

Data mining has attracted considerable research effort over the past decade. However, little work has been reported on efficiently supporting a large number of users who issue different data mining queries periodically, as new needs arise and as data is updated. Our work is motivated by the fact that the pattern-growth method, which constructs an initial tree and mines frequent patterns on top of it, is one of the most efficient methods for frequent pattern mining. In this paper, we present a data mining proxy approach that reduces the I/O cost of constructing an initial tree by utilizing trees that are already resident in memory. The tree we construct is the smallest for a given data mining query. In addition, our proxy approach can also reduce the CPU cost of mining patterns, because the cost of mining depends on the sizes of the trees. The focus of the work is to construct an initial tree efficiently. We propose three tree operations to construct a tree. With a unique coding scheme, we can efficiently project subtrees from on-disk or in-memory trees. Our performance study indicates that the data mining proxy significantly reduces the I/O cost of constructing trees and the CPU cost of mining patterns over the constructed trees.
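The idea of reusing a tree that is already in memory can be illustrated with a minimal Python sketch. It is not the paper's method: the three tree operations and the coding scheme are not described in this record, so the sketch uses a plain FP-tree-style prefix tree instead, and all names (`Node`, `build_tree`, `project`, `size`) are hypothetical. It assumes items are ordered by descending support, in which case a tree built for a lower minimum support can be pruned into the tree for a higher minimum support without rescanning the transactions.

```python
from collections import Counter


class Node:
    """One node of a prefix tree: an item, a count, and child nodes."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def build_tree(transactions, min_support):
    """Build an FP-tree-style prefix tree over items with support >= min_support.

    Assumption of this sketch: this is the step that scans the data (the I/O cost).
    """
    counts = Counter()
    for t in transactions:
        counts.update(set(t))
    root = Node()
    for t in transactions:
        # Keep frequent items, ordered by descending support (ties by item name).
        items = sorted(
            (i for i in set(t) if counts[i] >= min_support),
            key=lambda i: (-counts[i], i),
        )
        node = root
        for item in items:
            node = node.children.setdefault(item, Node(item))
            node.count += 1
    return root, counts


def project(root, counts, min_support):
    """Derive the tree for a higher min_support from an existing in-memory tree.

    Items that become infrequent sit below all frequent items on every path,
    so dropping their subtrees yields the smaller tree with no data rescan.
    """
    copy = Node(root.item)
    copy.count = root.count
    for item, child in root.children.items():
        if counts[item] >= min_support:
            copy.children[item] = project(child, counts, min_support)
    return copy


def size(root):
    """Number of item nodes in the tree (the root is not counted)."""
    return sum(1 + size(child) for child in root.children.values())


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "c", "d"], ["b", "c"], ["a", "d"]]
    cached, counts = build_tree(data, min_support=2)   # tree already in memory
    smaller = project(cached, counts, min_support=3)   # new query, no rescan
    print(size(cached), size(smaller))                 # prints "8 6"
```

In this toy setting the projected tree is identical to the one `build_tree` would produce directly for the higher threshold, and it has fewer nodes, which is consistent with the abstract's point that mining cost depends on tree size.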


Pages: 947-970

Page count: 24
