An efficient approach based on selective partitioning for maximal frequent itemsets mining

Cited by: 8

Authors:

Bai, Anita [1, 2]

Dhabu, Meera [1]

Jagtap, Viraj [3]

Deshpande, Parag S. [1]

Affiliations:

[1] Visvesvaraya Natl Inst Technol, Dept Comp Sci & Engn, Nagpur 440010, Maharashtra, India

[2] Bharat Inst Engn & Technol, Dept Comp Sci & Engn, Hyderabad 501510, India

[3] Amazon, Hyderabad, India

Source:

SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES | 2019 / Vol. 44 / Issue 08

Keywords:

Data mining; itemset-count tree; maximal frequent itemsets; partitions; transactional databases; association rules; algorithms; discovery

DOI:

10.1007/s12046-019-1158-1

Chinese Library Classification (CLC) number:

T [Industrial Technology]

Discipline classification code:

08

Abstract:

We present SelPMiner, a maximal frequent itemset (MFI) mining algorithm based on selective partitioning. It uses a novel data format, the itemset-count tree, a compact and optimized partition representation that reduces memory requirements. It also partitions the database selectively, which reduces the time spent scanning the database. As the algorithm progressively searches for longer frequent itemsets in a depth-first manner, it creates new, ever smaller partitions with fewer dimensions and only unique data instances, which speeds up support counting. SelPMiner uses a number of optimizations to prune the search space. We also prove upper bounds on the amount of memory consumed by these partitions. Experimental comparisons of SelPMiner with the fastest popular existing MFI mining algorithms on different types of datasets show significant speedups in computation time in many cases. SelPMiner works especially well when the minimum support is low, and it consumes less memory.
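
To make the terminology in the abstract concrete, the following is a minimal Python sketch of a naive depth-first MFI search. It is not SelPMiner and does not implement the itemset-count tree or the paper's pruning optimizations. It only illustrates two ideas the abstract mentions: each recursion step works on a smaller projected partition with fewer items, and duplicate rows are merged into a single row with a count. The function name `maximal_frequent_itemsets` and its parameters are assumptions made for this illustration.

```python
from collections import Counter


def maximal_frequent_itemsets(transactions, min_support):
    """Naive depth-first MFI search (illustrative only, not SelPMiner)."""
    # Merge duplicate transactions so a partition holds unique rows with counts.
    partition = Counter(frozenset(t) for t in transactions)
    leaves = []  # frequent itemsets that could not be extended in the search

    def search(prefix, part, candidates):
        extended = False
        for i, item in enumerate(candidates):
            rest = candidates[i + 1:]
            rest_set = frozenset(rest)
            projected = Counter()
            support = 0
            for row, count in part.items():
                if item in row:
                    support += count
                    # Projected partition: fewer items, duplicates merged.
                    projected[row & rest_set] += count
            if support >= min_support:
                extended = True
                search(prefix | {item}, projected, rest)
        if not extended and prefix:
            leaves.append(frozenset(prefix))

    items = sorted({item for t in transactions for item in t})
    search(frozenset(), partition, items)
    # A leaf is maximal only if no other leaf is a proper superset of it.
    return {s for s in leaves if not any(s < other for other in leaves)}


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    print(maximal_frequent_itemsets(data, min_support=3))
```

With the toy data above and a minimum support of 3, every pair of items is frequent but the full triple is not, so the maximal frequent itemsets are the three pairs. Lowering the minimum support to 2 makes the triple frequent, and it becomes the only maximal itemset.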

Pages: 22
