OPTIMIZATION AND REALIZATION OF PARALLEL FREQUENT ITEM SET MINING ALGORITHM

Cited by: 0
Authors
Yuan, Ling [1 ]
Li, Dan [1 ]
Chen, Yuzhong [1 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Comp Sci, Wuhan 430074, Peoples R China
Keywords
Data Mining; Frequent Item Sets; Candidate Item Sets; Key-Value Pairs
DOI
Not available
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Associative data mining is a research hotspot in the field of big data, and frequent item set mining is an important step in the analysis of associative data. This paper analyzes frequent item set mining based on the parallel Apriori algorithm. We identify two shortcomings of the parallel Apriori algorithm: it generates too many key-value pairs, and it occupies too much memory in the combiner stage. We therefore propose an optimized algorithm in which candidate item sets and local count information are kept in memory, greatly reducing the number of generated keys. In addition, when mining short frequent item sets, a method that reduces the number of scans of the transaction data without generating candidate item sets improves the algorithm's efficiency. We run experiments on the Hadoop platform to verify the performance of the proposed optimized algorithm. The experiments show that the optimized algorithm substantially improves running time and I/O compared with the non-optimized algorithm.
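The two ideas in the abstract can be illustrated with a minimal sketch. It is a plain-Python simulation of the map and reduce phases, not the authors' Hadoop implementation, and every function name here (`naive_map`, `in_memory_map`, `reduce_counts`, `short_itemsets_one_scan`) is hypothetical. The cutoff `max_len=2` for "short" item sets is also an assumption; the abstract does not state a length.

```python
from collections import Counter
from itertools import combinations


def naive_map(split, candidates):
    """Baseline mapper: emit one (candidate, 1) pair per occurrence."""
    pairs = []
    for transaction in split:
        items = set(transaction)
        for cand in candidates:
            if cand <= items:  # candidate is contained in the transaction
                pairs.append((cand, 1))
    return pairs


def in_memory_map(split, candidates):
    """Keep local counts in memory and emit one pair per distinct
    candidate per split, which reduces the number of emitted keys."""
    local = Counter()
    for transaction in split:
        items = set(transaction)
        for cand in candidates:
            if cand <= items:
                local[cand] += 1
    return list(local.items())


def reduce_counts(pairs, min_support):
    """Sum the counts per key and keep item sets meeting min_support."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return {k: v for k, v in totals.items() if v >= min_support}


def short_itemsets_one_scan(split, max_len=2):
    """Count every item set of length <= max_len in a single scan,
    without generating candidates first (max_len is an assumption)."""
    local = Counter()
    for transaction in split:
        items = sorted(set(transaction))
        for k in range(1, max_len + 1):
            for combo in combinations(items, k):
                local[frozenset(combo)] += 1
    return list(local.items())


# Toy transaction data, divided into two input splits.
splits = [
    [["a", "b", "c"], ["a", "b"], ["a", "c"]],
    [["a", "b"], ["b", "c"], ["a", "b", "c"]],
]
candidates = [frozenset(p) for p in (("a", "b"), ("a", "c"), ("b", "c"))]

naive_pairs = [p for s in splits for p in naive_map(s, candidates)]
local_pairs = [p for s in splits for p in in_memory_map(s, candidates)]
short_pairs = [p for s in splits for p in short_itemsets_one_scan(s)]

print(len(naive_pairs), len(local_pairs))
print(reduce_counts(local_pairs, min_support=4))
```

On this toy input both mappers lead to the same support counts after reduction, but the in-memory variant emits fewer key-value pairs. The one-scan function yields counts for 1-item and 2-item sets in a single pass, where a level-wise approach would scan the data once per length.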
Pages: 546 - 551
Page count: 6
Related papers
50 in total
  • [1] Parallel Frequent Item Set Mining with Selective Item Replication
    Ozkural, Eray
    Ucar, Bora
    Aykanat, Cevdet
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2011, 22 (10) : 1632 - 1640
  • [2] Optimization of frequent item set mining parallelization algorithm based on spark platform
    Deng, Fan
    Wang, Jiabin
    Lv, Sheng
    DISCOVER COMPUTING, 2024, 27 (01)
  • [3] Frequent item set mining
    Borgelt, Christian
    WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2012, 2 (06) : 437 - 456
  • [4] Frequent Item Set Mining Algorithm Based on Bit Combination
    Lu, Jun
    Zhao, Renpeng
    Zhou, Kailong
    2019 IEEE 4TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA ANALYSIS (ICCCBDA), 2019, : 72 - 76
  • [5] Parallel algorithm for mining frequent item sets based on Spark
    Mao Y.
    Wu B.
    Xu C.
    Zhang M.
    Jisuanji Jicheng Zhizao Xitong/Computer Integrated Manufacturing Systems, CIMS, 2023, 29 (04) : 1267 - 1283
  • [6] SaM: A Split and Merge Algorithm for Fuzzy Frequent Item Set Mining
    Borgelt, Christian
    Wang, Xiaomeng
    PROCEEDINGS OF THE JOINT 2009 INTERNATIONAL FUZZY SYSTEMS ASSOCIATION WORLD CONGRESS AND 2009 EUROPEAN SOCIETY OF FUZZY LOGIC AND TECHNOLOGY CONFERENCE, 2009, : 968 - 973
  • [7] A NEW FREQUENT ITEM SET MINING ALGORITHM BASED ON INTERVAL INTERSECTION
    Yungho-Leu
    Utami, Vania
    PROCEEDINGS OF 2015 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOL. 2, 2015, : 471 - 477
  • [8] Unique Constraint Frequent Item Set Mining
    Greeshma, L.
    Pradeepini, G.
    2016 IEEE 6TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING (IACC), 2016, : 68 - 72
  • [9] Application of Frequent Item Set Mining Algorithm in IDS Based on Hadoop Framework
    Tong, Zhang
    Ying, Hou
    PROCEEDINGS OF THE 30TH CHINESE CONTROL AND DECISION CONFERENCE (2018 CCDC), 2018, : 1908 - 1911
  • [10] A Parallel Frequent Item Counting Algorithm
    Yang, Xun
    Liu, Jun
    Zhou, Wenli
    2016 8TH INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN-MACHINE SYSTEMS AND CYBERNETICS (IHMSC), VOL. 2, 2016, : 225 - 230