OPTIMIZATION AND REALIZATION OF PARALLEL FREQUENT ITEM SET MINING ALGORITHM

Authors
Yuan, Ling [1]
Li, Dan [1]
Chen, Yuzhong [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Comp Sci, Wuhan 430074, Peoples R China
Source
PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING (ICALIP) | 2016
Keywords
Data Mining; Frequent item sets; Candidate Item Sets; Key-value pairs
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Associative data mining is a research hotspot in the field of big data, and frequent item set mining is an important step in the analysis of associative data. This paper analyzes frequent item set mining based on the parallel Apriori algorithm. It identifies two shortcomings of that algorithm: it generates too many key-value pairs, and it occupies too much memory in the combiner stage. We therefore propose an optimized algorithm in which candidate item sets and local count information are kept in memory, greatly reducing the number of generated keys. In addition, when mining short frequent item sets, the optimized algorithm avoids generating candidate item sets and reduces the number of scans over the transaction data, which improves efficiency. Experiments on the Hadoop platform evaluate the performance of the proposed algorithm and show that its running time and I/O are greatly improved compared with the non-optimized algorithm.
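The abstract does not give the implementation, so the following is only a minimal sketch of the general idea it describes: keeping candidate item sets and local counts in memory so that each input split emits one key-value pair per candidate, not one pair per occurrence. It simulates the map and reduce steps in plain Python without Hadoop. All function names, the toy transactions, and the support threshold are assumptions made for illustration and do not come from the paper.

```python
from collections import Counter
from itertools import combinations


def naive_map(transactions, candidates):
    """Emit one (candidate, 1) pair for every occurrence in the split."""
    pairs = []
    for transaction in transactions:
        items = set(transaction)
        for candidate in candidates:
            if candidate <= items:  # candidate is contained in the transaction
                pairs.append((candidate, 1))
    return pairs


def in_memory_map(transactions, candidates):
    """Keep local counts in memory; emit one pair per candidate seen in the split."""
    local_counts = Counter()
    for transaction in transactions:
        items = set(transaction)
        for candidate in candidates:
            if candidate <= items:
                local_counts[candidate] += 1
    return list(local_counts.items())


def reduce_counts(pairs_per_split, min_support):
    """Sum the partial counts from all splits and keep the frequent item sets."""
    totals = Counter()
    for pairs in pairs_per_split:
        for candidate, count in pairs:
            totals[candidate] += count
    return {c: n for c, n in totals.items() if n >= min_support}


# Two hypothetical input splits, each a list of transactions.
splits = [
    [["a", "b", "c"], ["a", "b"], ["a", "c"]],
    [["a", "b"], ["b", "c"], ["a", "b", "c"]],
]
all_items = sorted({i for split in splits for t in split for i in t})
candidates = [frozenset(pair) for pair in combinations(all_items, 2)]

naive_output = [naive_map(split, candidates) for split in splits]
local_output = [in_memory_map(split, candidates) for split in splits]

naive_pairs = sum(len(pairs) for pairs in naive_output)
local_pairs = sum(len(pairs) for pairs in local_output)
frequent = reduce_counts(local_output, min_support=4)

print(naive_pairs, local_pairs)  # 10 6
print(frequent)
```

Both mappers lead to the same frequent item sets after the reduce step; the difference in this toy run is only the number of intermediate pairs that would have to be shuffled, which is the cost the abstract says the optimization targets.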
Pages: 546-551
Number of pages: 6