TFP: An efficient algorithm for mining top-K frequent closed itemsets

Cited by: 125
Authors
Wang, JY [1]
Han, JW
Lu, Y
Tzvetkov, P
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
[2] Univ Illinois, Dept Comp Sci, Siebel Ctr Sci 2132, Urbana, IL 61801 USA
Funding
National Science Foundation (USA)
Keywords
data mining; frequent itemset; association rules; mining methods and algorithms
DOI
10.1109/TKDE.2005.81
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Frequent itemset mining has been studied extensively in the literature. Most previous studies require the specification of a min_support threshold and aim at mining a complete set of frequent itemsets satisfying min_support. However, in practice, it is difficult for users to provide an appropriate min_support threshold. In addition, a complete set of frequent itemsets is much less compact than a set of frequent closed itemsets. In this paper, we propose an alternative mining task: mining top-k frequent closed itemsets of length no less than min_l, where k is the desired number of frequent closed itemsets to be mined, and min_l is the minimal length of each itemset. An efficient algorithm, called TFP, is developed for mining such itemsets without min_support. Starting at min_support = 0 and by making use of the length constraint and the properties of top-k frequent closed itemsets, min_support can be raised effectively and the FP-Tree can be pruned dynamically both during and after the construction of the tree using our two proposed methods: the closed node count and descendant_sum. Moreover, mining is further sped up by employing a combined top-down and bottom-up FP-Tree traversal strategy, a set of search space pruning methods, a fast 2-level hash-indexed result tree, and a novel closed itemset verification scheme. Our extensive performance study shows that TFP has high performance and linear scalability in terms of the database size.
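The mining task defined in the abstract can be illustrated with a small brute-force sketch. This is not TFP: it builds no FP-Tree and uses none of the paper's pruning methods, and it is exponential in the number of items. It only shows what "top-k frequent closed itemsets of length no less than min_l" means, and how a threshold that starts at 0 can be raised once k qualifying itemsets are known. The names `closure` and `top_k_closed` are hypothetical, chosen for this illustration.

```python
import heapq
from itertools import combinations


def closure(itemset, transactions):
    """Intersection of all transactions containing itemset, with its support.

    Returns (None, 0) when no transaction contains the itemset.
    """
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        return None, 0
    return frozenset.intersection(*covering), len(covering)


def top_k_closed(transactions, k, min_l):
    """Brute-force top-k closed itemsets with at least min_l items.

    Illustration only (assumption: small inputs); not the TFP algorithm.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    found = {}          # closed itemset -> support
    best = []           # min-heap of the k highest supports seen so far
    min_support = 0     # starts at 0 and is raised as results accumulate

    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            closed, sup = closure(frozenset(combo), transactions)
            # A closure has the same support as the itemset it closes,
            # so anything below the current threshold cannot qualify.
            if sup == 0 or sup < min_support:
                continue
            if len(closed) < min_l or closed in found:
                continue
            found[closed] = sup
            heapq.heappush(best, sup)
            if len(best) > k:
                heapq.heappop(best)
            if len(best) == k:
                min_support = best[0]   # k-th best support so far

    ranked = sorted(
        ((tuple(sorted(c)), s) for c, s in found.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )
    return ranked[:k]


if __name__ == "__main__":
    db = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"a", "d"}, {"b", "c"}]
    for itemset, sup in top_k_closed(db, k=2, min_l=2):
        print(itemset, sup)
```

Ties at the k-th support are broken alphabetically here, which is a choice made for this sketch rather than something the abstract specifies.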
Pages: 652-664
Number of pages: 13