TFP: An efficient algorithm for mining top-K frequent closed itemsets

Cited by: 124
Authors
Wang, JY [1]
Han, JW
Lu, Y
Tzvetkov, P
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
[2] Univ Illinois, Dept Comp Sci, Siebel Ctr Sci 2132, Urbana, IL 61801 USA
Funding
US National Science Foundation
Keywords
data mining; frequent itemset; association rules; mining methods and algorithms
DOI
10.1109/TKDE.2005.81
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Frequent itemset mining has been studied extensively in the literature. Most previous studies require the specification of a min_support threshold and aim at mining a complete set of frequent itemsets satisfying min_support. However, in practice, it is difficult for users to provide an appropriate min_support threshold. In addition, a complete set of frequent itemsets is much less compact than a set of frequent closed itemsets. In this paper, we propose an alternative mining task: mining top-k frequent closed itemsets of length no less than min_l, where k is the desired number of frequent closed itemsets to be mined, and min_l is the minimal length of each itemset. An efficient algorithm, called TFP, is developed for mining such itemsets without min_support. Starting at min_support = 0 and by making use of the length constraint and the properties of top-k frequent closed itemsets, min_support can be raised effectively and the FP-Tree can be pruned dynamically both during and after the construction of the tree using our two proposed methods: the closed node count and descendant_sum. Moreover, mining is further sped up by employing a top-down and bottom-up combined FP-Tree traversing strategy, a set of search space pruning methods, a fast 2-level hash-indexed result tree, and a novel closed itemset verification scheme. Our extensive performance study shows that TFP has high performance and linear scalability in terms of the database size.
Pages: 652 - 664
Page count: 13
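
The mining task defined in the abstract (top-k frequent closed itemsets of length at least min_l, with min_support starting at 0 and rising as results accumulate) can be illustrated with a small brute-force sketch. This is not TFP: it uses no FP-Tree, no closed node count or descendant_sum pruning, and no result tree. The function names `closed_itemsets` and `top_k_closed` are hypothetical, and keeping ties at the k-th support value is an assumption about how "top-k" is read here.

```python
import heapq


def closed_itemsets(transactions):
    """Return every closed itemset as a frozenset.

    The closed itemsets are exactly the non-empty intersections of
    one or more transactions, so we close the transaction set under
    pairwise intersection. Only suitable for tiny inputs.
    """
    closed = set()
    frontier = {frozenset(t) for t in transactions if t}
    while frontier:
        closed |= frontier
        new = set()
        for a in frontier:
            for b in closed:
                c = a & b
                if c and c not in closed:
                    new.add(c)
        frontier = new
    return closed


def top_k_closed(transactions, k, min_l):
    """Top-k frequent closed itemsets of length >= min_l (ties kept)."""
    tx = [frozenset(t) for t in transactions]
    min_support = 0   # starts at 0, raised once k candidates are known
    heap = []         # min-heap holding the k highest supports so far
    kept = []
    for itemset in closed_itemsets(tx):
        if len(itemset) < min_l:
            continue  # length constraint
        sup = sum(1 for t in tx if itemset <= t)
        if sup < min_support:
            continue  # cannot be in the top k any more
        kept.append((sup, itemset))
        if len(heap) < k:
            heapq.heappush(heap, sup)
        elif sup > heap[0]:
            heapq.heapreplace(heap, sup)
        if len(heap) == k:
            min_support = heap[0]  # raise the threshold
    result = [(s, i) for s, i in kept if s >= min_support]
    return sorted(result, key=lambda p: (-p[0], sorted(p[1])))


if __name__ == "__main__":
    db = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"},
          {"a", "b"}, {"a", "d"}, {"c"}]
    for sup, itemset in top_k_closed(db, k=2, min_l=2):
        print(sup, sorted(itemset))
```

The threshold only ever increases, so skipping a candidate whose support is below the current `min_support` never discards a valid answer. The sketch shows only that one idea from the abstract; the efficiency of TFP comes from applying it while building and traversing the FP-Tree, which is not reproduced here.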