Improvement and research of FP-growth algorithm based on distributed spark

Cited by: 10
Authors
Deng Lingling [1]
Lou Yuansheng [1]
Affiliations
[1] Hohai Univ, Coll Comp & Informat, Nanjing, Jiangsu, Peoples R China
Keywords
Frequent item sets; FP-growth; Spark; Complexity; Efficiency;
DOI
10.1109/CCBD.2015.15
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The FP-growth algorithm, a representative non-pruning algorithm, is widely used for mining transaction datasets, but it is sensitive to computational cost and to dataset scale. When the FP-tree is built, the search operation is the main time-consuming step and has high complexity. When the horizontal or vertical dimension of the dataset is large, mining efficiency drops and mining may even fail. Reducing the time complexity of search and applying distributed computing are the widely used strategies for addressing these problems. This paper presents SPFP, a distributed algorithm based on the Spark framework and an improved FP-growth algorithm. Test results show that, compared with the MapReduce-based PFP algorithm, the Spark-based OPFP algorithm, and the original FP-growth algorithm, SPFP has high efficiency, cluster performance, and flexibility.
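The abstract identifies the search operation during FP-tree construction as the main cost. As a rough illustration only, the sketch below builds an FP-tree in which each node keeps its children in a hash map, so finding the child for the next item is a constant-time lookup rather than a scan over a child list. This is a generic single-machine sketch, not the paper's SPFP algorithm; the names `FPNode` and `build_fp_tree` and the sample transactions are hypothetical.

```python
from collections import Counter


class FPNode:
    """One FP-tree node; children are keyed by item for constant-time lookup."""

    def __init__(self, item=None, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}  # item -> FPNode (hash lookup instead of a list scan)


def build_fp_tree(transactions, min_support):
    """Build an FP-tree from a list of transactions (lists of sortable items)."""
    # Pass 1: count support, counting each item at most once per transaction.
    support = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in support.items() if c >= min_support}

    root = FPNode()
    # Pass 2: insert each transaction, items ordered by descending support
    # (ties broken by item value so the ordering is deterministic).
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            child = node.children.get(item)  # the "search" step
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent


# Hypothetical sample data, chosen only to exercise the sketch.
transactions = [["a", "b", "c"], ["a", "b"], ["a", "d"], ["b", "c"]]
root, frequent = build_fp_tree(transactions, min_support=2)
```

Here item `d` falls below the support threshold and is dropped, and the first three transactions share the prefix path starting at `a`. A distributed variant would additionally partition the work across a cluster, which this sketch does not attempt.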
Pages: 105-108
Page count: 4
Related papers
50 in total
  • [1] DFPS: Distributed FP-growth Algorithm Based on Spark
    Shi, Xiujin
    Chen, Shaozong
    Yang, Hui
    [J]. 2017 IEEE 2ND ADVANCED INFORMATION TECHNOLOGY, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (IAEAC), 2017, : 1725 - 1731
  • [2] The Research and Improvement Based on FP-Growth Data Mining Algorithm
    Yao, Quanzhu
    Gao, Xingxing
    Lei, Xueli
    Zhang, Tong
    [J]. PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON MODELING, SIMULATION AND OPTIMIZATION TECHNOLOGIES AND APPLICATIONS (MSOTA2016), 2016, 58 : 287 - 293
  • [3] Research on Association Rule Algorithm Based on Distributed and Weighted FP-Growth
    Wang, Huaibin
    Liu, Yuanchao
    Wang, Chundong
    [J]. ADVANCES IN MULTIMEDIA, SOFTWARE ENGINEERING AND COMPUTING, VOL 1, 2011, 128 : 133 - 138
  • [4] Research and Improvement of Parallelization of FP - Growth Algorithm Based on Spark
    Zhang, Fan
    Xiao, Youan
    Long, Yihong
    [J]. PROCEEDINGS OF 2017 IEEE 7TH INTERNATIONAL CONFERENCE ON ELECTRONICS INFORMATION AND EMERGENCY COMMUNICATION (ICEIEC), 2017, : 145 - 148
  • [5] Research on FP-Growth Algorithm for Massive Telecommunication Network Alarm Data based on Spark
    Li, Chuan
    Huang, Xiaojun
    [J]. PROCEEDINGS OF 2016 IEEE 7TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE (ICSESS 2016), 2016, : 875 - 879
  • [6] An improved parallel FP-growth algorithm based on Spark and its application
    Miao, Yuhang
    Lin, Jinxing
    Xu, Nuo
    [J]. PROCEEDINGS OF THE 38TH CHINESE CONTROL CONFERENCE (CCC), 2019, : 3793 - 3797
  • [7] Research on Association Rules Parallel Algorithm Based on FP-Growth
    Chen, Ke
    Zhang, Lijun
    Li, Sansi
    Ke, Wende
    [J]. INFORMATION COMPUTING AND APPLICATIONS, PT II, 2011, 244 : 249 - +
  • [8] Research and Improvement of Intrusion Detection Based on Isolated Forest and FP-Growth
    Zhou, Yansen
    Cui, Jianquan
    Liu, Qi
    [J]. 2020 IEEE 8TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT), 2020, : 160 - 164
  • [9] The Application of FP-Growth Algorithm Based on Distributed Intelligence in Wisdom Medical Treatment
    Xu, Fangqin
    Lu, Haifeng
    [J]. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2017, 31 (04)
  • [10] Distributed pruning optimization oriented FP-Growth method based on PSO algorithm
    Wei, Hong
    Luo, Qixing
    Chen, Zexi
    Chen, Yingzhe
    [J]. PROCEEDINGS OF 2017 IEEE 2ND INFORMATION TECHNOLOGY, NETWORKING, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (ITNEC), 2017, : 1244 - 1248