A data mining proxy approach for efficient frequent itemset mining

Cited by: 2
Authors
Yu, Jeffrey Xu [1 ]
Li, Zhiheng [1 ]
Liu, Guimei [2 ]
Affiliations
[1] Chinese Univ Hong Kong, Hong Kong, Hong Kong, Peoples R China
[2] Natl Univ Singapore, Singapore 117548, Singapore
Source
VLDB JOURNAL | 2008 / Vol. 17 / Issue 04
Keywords
Data Mining; Association Rule; Frequent Pattern; Minimum Support; Frequent Itemset
DOI
10.1007/s00778-007-0047-0
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Data mining has attracted considerable research effort during the past decade. However, little work has been reported on efficiently supporting a large number of users who issue different data mining queries periodically, when new needs arise and when data is updated. Our work is motivated by the fact that the pattern-growth method, which constructs an initial tree and mines frequent patterns on top of that tree, is one of the most efficient methods for frequent pattern mining. In this paper, we present a data mining proxy approach that reduces the I/O cost of constructing an initial tree by utilizing trees that are already resident in memory. The tree we construct is the smallest for a given data mining query. In addition, our proxy approach can also reduce the CPU cost of mining patterns, because the cost of mining depends on the sizes of the trees. The focus of the work is to construct an initial tree efficiently. We propose three tree operations to construct a tree. With a unique coding scheme, we can efficiently project subtrees from on-disk trees or in-memory trees. Our performance study indicates that the data mining proxy significantly reduces the I/O cost of constructing trees and the CPU cost of mining patterns over the constructed trees.
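As a rough illustration of the "initial tree" that pattern-growth mining starts from, the Python sketch below builds a generic FP-tree-style prefix tree and reports its size for several minimum support values. It is not the paper's proxy approach, its three tree operations, or its coding scheme, none of which are specified in this record. The names Node and build_tree and the sample transactions are hypothetical.

```python
from collections import Counter


class Node:
    """One node of a prefix tree over frequent items (hypothetical helper)."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_tree(transactions, min_support):
    """Build a generic FP-tree-style prefix tree; return (root, node_count)."""
    # Pass 1: count how many transactions contain each item.
    support = Counter(item for t in transactions for item in set(t))
    frequent = {item for item, c in support.items() if c >= min_support}

    root = Node(None)
    size = 0
    # Pass 2: insert each transaction, keeping only frequent items,
    # ordered by descending support (ties broken by item name).
    for t in transactions:
        items = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-support[i], i),
        )
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                size += 1
            child.count += 1
            node = child
    return root, size


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    tx = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]
    for s in (1, 2, 3):
        _, n = build_tree(tx, s)
        print(f"min_support={s}: {n} nodes")
```

In this sketch a higher minimum support prunes more items and yields a smaller tree, which is consistent with the abstract's remark that mining cost depends on tree size.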
Pages: 947 - 970
Page count: 24
Related papers
50 in total
  • [31] A novel algorithm for frequent itemset mining in data warehouses
    徐利军
    谢康林
    Journal of Zhejiang University-Science A(Applied Physics & Engineering), 2006, (02) : 216 - 224
  • [32] Parallel Incremental Frequent Itemset Mining for Large Data
    Yu-Geng Song
    Hui-Min Cui
    Xiao-Bing Feng
    Journal of Computer Science and Technology, 2017, 32 : 368 - 385
  • [33] Recommendation using Frequent Itemset Mining in Big Data
    Kunjachan, Honeytta
    Hareesh, M. J.
    Sreedevi, K. M.
    PROCEEDINGS OF THE 2018 SECOND INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICICCS), 2018, : 561 - 566
  • [34] New approach in Big Data Mining for frequent itemset using mapreduce in HDFS
    Nikam, Pallavi V.
    Deshpande, Deepa S.
    2018 3RD INTERNATIONAL CONFERENCE FOR CONVERGENCE IN TECHNOLOGY (I2CT), 2018,
  • [35] HDFS Framework for Efficient Frequent Itemset Mining Using MapReduce
    Kulkarni, Prajakta G.
    Khonde, Shraddha R.
    2017 1ST INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND INFORMATION MANAGEMENT (ICISIM), 2017, : 171 - 178
  • [36] An efficient polynomial delay algorithm for pseudo frequent itemset mining
    Uno, Takeaki
    Arimura, Hiroki
    DISCOVERY SCIENCE, PROCEEDINGS, 2007, 4755 : 219 - +
  • [37] Efficient weighted probabilistic frequent itemset mining in uncertain databases
    Li, Zhiyang
    Chen, Fengjuan
    Wu, Junfeng
    Liu, Zhaobin
    Liu, Weijiang
    EXPERT SYSTEMS, 2021, 38 (05)
  • [38] AN EFFICIENT ITEMSET REPRESENTATION FOR MINING FREQUENT PATTERNS IN TRANSACTIONAL DATABASES
    Tomovic, Savo
    Stanisic, Predrag
    COMPUTING AND INFORMATICS, 2018, 37 (04) : 894 - 914
  • [39] An Efficient Spark-Based Hybrid Frequent Itemset Mining Algorithm for Big Data
    Al-Bana, Mohamed Reda
    Farhan, Marwa Salah
    Othman, Nermin Abdelhakim
    DATA, 2022, 7 (01)
  • [40] An Efficient Frequent Itemset Mining Method over High-speed Data Streams
    Memar, Mina
    Deypir, Mahmood
    Sadreddini, Mohammad Hadi
    Fakhrahmad, Seyyed Mostafa
    COMPUTER JOURNAL, 2012, 55 (11) : 1357 - 1366