PNPFI: An Efficient Parallel Frequent Itemsets Mining Algorithm

Cited by: 0
Authors
Zhang, Fang [1]
Zhang, Yu [1]
Liao, Xiaofei [1]
Jin, Hai [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Cluster & Grid Comp Lab, Serv Comp Technol & Syst Lab, Wuhan 430074, Hubei, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Frequent itemsets mining; N-list; P-Subsume; Load balancing strategy;
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Frequent itemsets mining (FIM) plays an important role in many data mining areas. With the explosion of data scale, a number of parallel FIM algorithms have been proposed. Although existing solutions scale well, they suffer from high CPU and memory consumption because they mine frequent itemsets recursively on a tree structure. In this paper, we propose a novel parallel algorithm, named PNPFI, which employs three key optimizations. First, itemsets are stored in the N-list structure, which is more compact than existing tree-based structures. Second, a new structure, called P-Subsume, is used to generate some frequent itemsets without N-list intersection. Third, PNPFI proposes a new load balancing strategy, which divides a large-scale FIM problem into a set of tasks based on the profiled load of each item. Experimental results show that, compared with state-of-the-art algorithms, PNPFI improves performance by 39% on average (up to 79%) and reduces memory usage by 58% on average (up to 90%).
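As a rough illustration of the load balancing idea mentioned in the abstract, the sketch below assigns items to tasks according to an estimated per-item load. It is not the PNPFI strategy itself, which the abstract does not specify; it swaps in a standard greedy heuristic (heaviest item first, always to the currently lightest task). The function name `balance_items`, the item names, and the load values are all hypothetical.

```python
import heapq


def balance_items(item_loads, num_tasks):
    """Greedy heaviest-first assignment of items to tasks.

    Assumed heuristic for illustration only, not the method from the paper.
    item_loads maps each item to its profiled (estimated) mining load.
    """
    if num_tasks < 1:
        raise ValueError("num_tasks must be at least 1")

    # Min-heap of (accumulated_load, task_index): the lightest task pops first.
    heap = [(0, t) for t in range(num_tasks)]
    heapq.heapify(heap)
    tasks = [[] for _ in range(num_tasks)]

    # Visit items from heaviest to lightest; ties are broken by item name
    # so the result is deterministic.
    ordered = sorted(item_loads.items(), key=lambda kv: (-kv[1], kv[0]))
    for item, load in ordered:
        total, t = heapq.heappop(heap)
        tasks[t].append(item)
        heapq.heappush(heap, (total + load, t))

    totals = [sum(item_loads[i] for i in group) for group in tasks]
    return tasks, totals


# Hypothetical profiled loads for six items, split across three tasks.
profiled = {"a": 9, "b": 7, "c": 6, "d": 5, "e": 4, "f": 2}
tasks, totals = balance_items(profiled, 3)
print(tasks)
print(totals)
```

The heap keeps each assignment at O(log k) for k tasks. A real profiler would have to supply the load estimates; here they are made-up integers.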
Pages: 172-177
Page count: 6