Fast algorithms for frequent itemset mining using FP-trees

Cited by: 336
Authors
Grahne, G [1]
Zhu, JF [1]
Affiliations
[1] Concordia Univ, Dept Comp Sci, Montreal, PQ H3G 1M8, Canada
Keywords
data mining; association rules
DOI
10.1109/TKDE.2005.166
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Efficient algorithms for mining frequent itemsets are crucial for mining association rules as well as for many other data mining tasks. Methods for mining frequent itemsets have been implemented using a prefix-tree structure, known as an FP-tree, for storing compressed information about frequent itemsets. Numerous experimental results have demonstrated that these algorithms perform extremely well. In this paper, we present a novel FP-array technique that greatly reduces the need to traverse FP-trees, thus obtaining significantly improved performance for FP-tree-based algorithms. Our technique works especially well for sparse data sets. Furthermore, we present new algorithms for mining all, maximal, and closed frequent itemsets. Our algorithms use the FP-tree data structure in combination with the FP-array technique efficiently and incorporate various optimization techniques. We also present experimental results comparing our methods with existing algorithms. The results show that our methods are the fastest for many cases. Even though the algorithms consume much memory when the data sets are sparse, they are still the fastest ones when the minimum support is low. Moreover, they are always among the fastest algorithms and consume less memory than other methods when the data sets are dense.
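As a rough illustration of the prefix-tree idea mentioned in the abstract, the sketch below builds a small FP-tree in two scans and, during the second scan, records co-occurrence counts for pairs of frequent items. It is not the authors' implementation: the names FPNode, build_fp_tree, and pair_counts are hypothetical, and the pair counter is only a simplified stand-in for the FP-array, whose details the abstract does not give.

```python
from collections import Counter
from itertools import combinations


class FPNode:
    """One node of the prefix tree: an item, a count, and child links."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Return (root, order, pair_counts) for a list of transactions."""
    # First scan: count single items and keep the frequent ones.
    item_counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in item_counts.items() if c >= min_support}
    # Order by descending support, ties broken by item name.
    order = sorted(frequent, key=lambda i: (-frequent[i], i))
    rank = {item: r for r, item in enumerate(order)}

    root = FPNode(None)
    pair_counts = Counter()  # assumed stand-in for a pair-count array
    # Second scan: insert each filtered, sorted transaction as a path.
    for t in transactions:
        path = sorted((i for i in set(t) if i in rank), key=rank.get)
        node = root
        for item in path:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
        # Record co-occurrence counts for every pair of frequent items.
        pair_counts.update(combinations(path, 2))
    return root, order, pair_counts
```

Transactions that share a prefix of frequent items share a path in the tree, which is where the compression comes from. Keeping pair counts alongside the tree is one way to answer "which items co-occur often enough" without walking the tree again.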
Pages: 1347-1362
Page count: 16