Efficient algorithms for deriving complete frequent itemsets from frequent closed itemsets

Cited by: 4
Authors
Wu, Cheng-Wei [1 ]
Huang, JianTao [1 ]
Lin, Yun-Wei [1 ]
Chuang, Chien-Yu [1 ]
Tseng, Yu-Chee [2 ]
Affiliations
[1] Natl Ilan Univ, Yilan, Taiwan
[2] Natl Yang Ming Chiao Tung Univ, Yilan, Taiwan
Keywords
Frequent itemset mining; Frequent closed itemset mining; Lossless and condensed representation; Deriving algorithms;
DOI
10.1007/s10489-020-02172-7
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Mining frequent itemsets (FIs) from dense datasets usually produces too many itemsets, which makes the mining task suffer from very long execution times and high memory consumption. The frequent closed itemset (FCI) is a compact and lossless representation of FIs. Mining FCIs not only reduces execution time and memory usage but also preserves the complete information of the FIs, which can be derived from the FCIs. Although many studies have proposed efficient methods for mining FCIs, few have developed algorithms for efficiently deriving FIs from FCIs. In this work, we propose two efficient algorithms, DFI-List and DFI-Growth, for deriving FIs from FCIs. Both algorithms adopt depth-first search and a divide-and-conquer methodology to derive all the FIs. DFI-List derives all the FIs with a vertical index structure called Cid List. DFI-Growth compresses the information of FCIs into tree structures and applies a pattern-growth strategy to derive FIs from the trees. Empirical experiments show that DFI-List is the most efficient and scalable algorithm on dense datasets. For example, when the minimum support threshold is set to 50% on the Chess dataset, DFI-List runs more than 100 times faster than LevelWise (Pasquier et al. Inf Syst 24(1): 25-46, 1999b). DFI-Growth, in turn, is the most stable and memory-efficient algorithm on sparse datasets. Both DFI-Growth and DFI-List are superior to the state-of-the-art algorithm (Pasquier et al. Inf Syst 24(1): 25-46, 1999b) in terms of execution time.
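To make the notion of "deriving FIs from FCIs" concrete, the following is a minimal sketch of a naive baseline. It is not DFI-List, DFI-Growth, or LevelWise, whose details the abstract does not give. It relies only on the standard property of closed itemsets: every frequent itemset is a subset of at least one FCI, and its support equals the largest support among the FCIs that contain it. The function name `derive_frequent_itemsets` and the toy data are assumptions made for illustration.

```python
from itertools import combinations


def derive_frequent_itemsets(closed_itemsets):
    """Naively derive all frequent itemsets from frequent closed itemsets.

    closed_itemsets: dict mapping frozenset(items) -> support count.
    Returns a dict mapping every non-empty frequent itemset to its support.
    """
    derived = {}
    for closed, support in closed_itemsets.items():
        items = sorted(closed)
        # Enumerate every non-empty subset of this closed itemset.
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                key = frozenset(subset)
                # Support of an itemset is the maximum support over
                # all closed itemsets that contain it.
                if support > derived.get(key, 0):
                    derived[key] = support
    return derived


# Toy example (hypothetical data): the FCIs of the transactions
# {a,b,c}, {a,b}, {a,c}, {a} with a minimum support count of 1.
toy_fcis = {
    frozenset("a"): 4,
    frozenset("ab"): 2,
    frozenset("ac"): 2,
    frozenset("abc"): 1,
}
toy_fis = derive_frequent_itemsets(toy_fcis)
```

This baseline enumerates every subset of every FCI, so its cost grows exponentially with the length of the closed itemsets; avoiding that kind of redundant work is the motivation for more efficient deriving algorithms such as those proposed in the paper.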
Pages: 7002-7023
Page count: 22
Related papers
50 in total
  • [1] Efficient algorithms for deriving complete frequent itemsets from frequent closed itemsets
    Cheng-Wei Wu
    JianTao Huang
    Yun-Wei Lin
    Chien-Yu Chuang
    Yu-Chee Tseng
    Applied Intelligence, 2022, 52 : 7002 - 7023
  • [2] NUCLEAR: An Efficient Methods for Mining Frequent Itemsets and Generators from Closed Frequent Itemsets
    Huy Quang Pham
    Duc Tran
    Ninh Bao Duong
    Fournier-Viger, Philippe
    Alioune Ngom
    INFORMATION TECHNOLOGY IN INDUSTRY, 2019, 7 (02): : 1 - 13
  • [3] Efficient mining frequent itemsets algorithms
    Marghny H. Mohamed
    Mohammed M. Darwieesh
    International Journal of Machine Learning and Cybernetics, 2014, 5 : 823 - 833
  • [4] Efficient mining frequent itemsets algorithms
    Mohamed, Marghny H.
    Darwieesh, Mohammed M.
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2014, 5 (06) : 823 - 833
  • [5] An efficient algorithm for mining frequent closed itemsets
    Fang, Gang
    Wu, Yue
    Li, Ming
    Chen, Jia
    Informatica (Slovenia), 2015, 39 (01): : 87 - 98
  • [6] An Efficient Algorithm for Mining Frequent Closed Itemsets
    Fang, Gang
    Wu, Yue
    Li, Ming
    Chen, Jia
    INFORMATICA-JOURNAL OF COMPUTING AND INFORMATICS, 2015, 39 (01): : 87 - 98
  • [7] Efficient algorithms of mining top-k frequent closed itemsets
    Lan Yongjie
    Qiu Yong
    ICEMI 2007: PROCEEDINGS OF 2007 8TH INTERNATIONAL CONFERENCE ON ELECTRONIC MEASUREMENT & INSTRUMENTS, VOL II, 2007, : 551 - 554
  • [8] δ-tolerance closed frequent itemsets
    Cheng, James
    Ke, Yiping
    Ng, Wilfred
    ICDM 2006: SIXTH INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2006, : 139 - +
  • [9] An Efficient Algorithm for Deriving Frequent Itemsets from Lossless Condensed Representation
    Huang, JianTao
    Lai, Yi-Pei
    Lo, Chieh
    Wu, Cheng-Wei
    ADVANCES AND TRENDS IN ARTIFICIAL INTELLIGENCE: FROM THEORY TO PRACTICE, 2019, 11606 : 216 - 229
  • [10] An Efficient Mining Model for Global Frequent Closed Itemsets
    Lin, Jianming
    Ju, Chunhua
    Liu, Dongsheng
    PROCEEDINGS OF THE SECOND INTERNATIONAL SYMPOSIUM ON ELECTRONIC COMMERCE AND SECURITY, VOL II, 2009, : 278 - 282