An efficient approach based on selective partitioning for maximal frequent itemsets mining

Cited by: 8
Authors
Bai, Anita [1 ,2 ]
Dhabu, Meera [1 ]
Jagtap, Viraj [3 ]
Deshpande, Parag S. [1 ]
Affiliations
[1] Visvesvaraya Natl Inst Technol, Dept Comp Sci & Engn, Nagpur 440010, Maharashtra, India
[2] Bharat Inst Engn & Technol, Dept Comp Sci & Engn, Hyderabad 501510, India
[3] Amazon, Hyderabad, India
Keywords
Data mining; itemset-count tree; maximal frequent itemsets; partitions; transactional databases; ASSOCIATION RULES; ALGORITHMS; DISCOVERY;
DOI
10.1007/s12046-019-1158-1
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
We present SelPMiner, a maximal frequent itemset (MFI) mining algorithm based on selective partitioning. It uses a novel data format, the Itemset-count tree, a compact and optimized representation in the form of a partition that reduces memory requirements. It also partitions the database selectively, which reduces the time needed to scan the database. As the algorithm progressively searches for longer frequent itemsets in a depth-first manner, it creates new, even smaller partitions with fewer dimensions and only unique data instances, which results in faster support counting. SelPMiner uses a number of optimizations to prune the search space. We also prove upper bounds on the amount of memory consumed by these partitions. Experimental comparisons of SelPMiner with the fastest popular existing MFI mining algorithms on different types of datasets show significant speedups in computation time in many cases. SelPMiner works especially well when the minimum support is low, and it consumes less memory.
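To make the terms in the abstract concrete, the following is a minimal Python sketch of depth-first MFI mining over progressively smaller partitions. It is not SelPMiner and does not implement the Itemset-count tree or the paper's pruning optimizations. It only illustrates the general idea, under the assumption that a "partition" can be modeled as a mapping from a unique reduced transaction to its multiplicity. The function name `maximal_frequent_itemsets` and the toy data are hypothetical.

```python
from collections import Counter


def maximal_frequent_itemsets(transactions, min_support):
    """Illustrative depth-first MFI miner (a sketch, not the SelPMiner algorithm)."""
    maximal = []

    def mine(prefix, partition):
        # partition maps each unique reduced row (frozenset) to its multiplicity,
        # so identical rows are counted once instead of being rescanned.
        support = Counter()
        for row, count in partition.items():
            for item in row:
                support[item] += count
        frequent = sorted(i for i, s in support.items() if s >= min_support)

        for idx, item in enumerate(frequent):
            # Keep only items that come later in the order: fewer dimensions.
            later = frozenset(frequent[idx + 1:])
            projected = Counter()
            for row, count in partition.items():
                if item in row:
                    projected[row & later] += count
            mine(prefix | {item}, projected)

        # A node with no frequent extension is a candidate; it is maximal only
        # if no previously found itemset already contains it.
        if not frequent and prefix and not any(prefix <= m for m in maximal):
            maximal.append(prefix)

    root = Counter(frozenset(t) for t in transactions)
    mine(frozenset(), root)
    return maximal


if __name__ == "__main__":
    toy = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"},
           {"c", "d"}, {"c", "d"}, {"d"}]
    for itemset in maximal_frequent_itemsets(toy, min_support=2):
        print(sorted(itemset))
```

Because items are extended in a fixed order, every frequent superset of a candidate is visited before the candidate itself, so checking against the itemsets found so far is sufficient in this sketch. Real MFI miners replace that linear check with much stronger pruning.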
Pages: 22
Related papers
50 in total
  • [1] An efficient approach based on selective partitioning for maximal frequent itemsets mining
    Anita Bai
    Meera Dhabu
    Viraj Jagtap
    Parag S Deshpande
    [J]. Sādhanā, 2019, 44
  • [2] Mining maximal frequent itemsets by a boolean based approach
    Salleb, A
    Maazouzi, Z
    Vrain, C
    [J]. ECAI 2002: 15TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2002, 77 : 385 - 389
  • [3] GeneticMax: An Efficient Approach to Mining Maximal Frequent Itemsets Based on Genetic Algorithms
    Kabir, Mir Md. Jahangir
    Xu, Shuxiang
    Kang, Byeong Ho
    Zhao, Zongyuan
    [J]. INFORMATION TECHNOLOGY IN INDUSTRY, 2015, 3 (03): : 64 - 73
  • [4] An efficient maximal frequent itemsets mining algorithm - Based on frequent pattern tree
    Xue, XR
    Wang, GY
    Wu, Y
    Yang, SX
    [J]. DYNAMICS OF CONTINUOUS DISCRETE AND IMPULSIVE SYSTEMS-SERIES B-APPLICATIONS & ALGORITHMS, 2005, 1 : 176 - 181
  • [5] GenMax: An Efficient Algorithm for Mining Maximal Frequent Itemsets
    Karam Gouda
    Mohammed J. Zaki
    [J]. Data Mining and Knowledge Discovery, 2005, 11 : 223 - 242
  • [6] GenMax: An efficient algorithm for mining maximal frequent itemsets
    Gouda, K
    Zaki, MJ
    [J]. DATA MINING AND KNOWLEDGE DISCOVERY, 2005, 11 (03) : 223 - 242
  • [7] Improvements in the data partitioning approach for frequent itemsets mining
    Nguyen, SN
    Orlowska, ME
    [J]. KNOWLEDGE DISCOVERY IN DATABASES: PKDD 2005, 2005, 3721 : 625 - 633
  • [8] Efficient Mining of Maximal Frequent Itemsets Based on M-Step Lookahead
    Meyer, Elijah L.
    Chung, Soon M.
    [J]. PROCEEDINGS OF 2018 5TH INTERNATIONAL CONFERENCE ON DATA AND SOFTWARE ENGINEERING (ICODSE), 2018,
  • [9] AN EFFICIENT ALGORITHM BASED ON TIME DECAY MODEL FOR MINING MAXIMAL FREQUENT ITEMSETS
    Huang, Guo-Yan
    Wang, Li-Bo
    Hu, Chang-Zhen
    Ren, Jia-Dong
    He, Hui-Ling
    [J]. PROCEEDINGS OF 2009 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-6, 2009, : 2063 - +
  • [10] An efficient approach for interactive mining of frequent itemsets
    Deng, ZH
    Li, X
    Tang, SW
    [J]. ADVANCES IN WEB-AGE INFORMATION MANAGEMENT, PROCEEDINGS, 2005, 3739 : 138 - 149