An improved parallel FP-growth algorithm based on Spark and its application

Cited by: 0
Authors
Miao, Yuhang [1]
Lin, Jinxing [1]
Xu, Nuo [1]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Coll Automat, Nanjing 210023, Peoples R China
Keywords
Frequent itemset mining; Big data; Parallel FP-growth; Spark; Steam turbine
DOI
10.23919/chicc.2019.8866373
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Frequent itemset mining (FIM) is an important technique for data analysis. As data sets grow, single-machine FIM algorithms suffer from long running times and high memory consumption. Running the mining algorithm in parallel on distributed machines can break through the performance bottleneck of single-machine algorithms. In this paper, an improved parallel FP-growth algorithm based on Spark is presented. First, the FP-growth algorithm is improved using a matrix technique: compressing the data set into an information matrix reduces memory consumption. Then, the improved FP-growth algorithm is parallelized on Spark. Finally, the proposed algorithm is applied to the performance optimization of steam turbines in thermal power plants. The results show that the proposed algorithm is more efficient than the existing parallel FP-growth algorithm.
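The abstract does not specify how the information matrix is built, so the following is only a minimal sketch of the general idea of compressing transactions into a 0/1 matrix and counting support from it. It is not the authors' algorithm and it is not FP-growth: it enumerates candidate itemsets by brute force and counts support with a bitwise AND over matrix rows. The function names, the bitmask layout, and the `max_size` cap are all assumptions made for illustration.

```python
from itertools import combinations


def build_information_matrix(transactions, min_support):
    """Encode transactions as one bitmask row per frequent item (hypothetical layout).

    Bit k of rows[item] is 1 when transaction k contains the item.
    """
    counts = {}
    for t in transactions:
        for item in set(t):
            counts[item] = counts.get(item, 0) + 1
    # Keep only frequent items, ordered by descending support as FP-growth does.
    items = sorted((i for i, c in counts.items() if c >= min_support),
                   key=lambda i: (-counts[i], i))
    rows = {}
    for item in items:
        mask = 0
        for idx, t in enumerate(transactions):
            if item in t:
                mask |= 1 << idx
        rows[item] = mask
    return items, rows


def support(itemset, rows):
    """Support of an itemset = number of 1 bits in the AND of its rows."""
    it = iter(itemset)
    mask = rows[next(it)]
    for item in it:
        mask &= rows[item]
    return bin(mask).count("1")


def frequent_itemsets(transactions, min_support, max_size=3):
    """Brute-force enumeration over the matrix (assumption: small item counts)."""
    items, rows = build_information_matrix(transactions, min_support)
    result = {}
    for k in range(1, max_size + 1):
        for combo in combinations(items, k):
            s = support(combo, rows)
            if s >= min_support:
                result[frozenset(combo)] = s
    return result


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a"}]
    for itemset, s in frequent_itemsets(data, min_support=2).items():
        print(sorted(itemset), s)
```

The point of the sketch is only that, once the data set is held as compact bit rows, support counting no longer needs to rescan the raw transactions. In a Spark setting the rows or the candidate groups would be partitioned across workers, which this single-process example does not attempt.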
Pages: 3793-3797
Page count: 5
Related papers
50 in total
  • [1] A Parallel FP-growth Algorithm Based on GPU
    Jiang, Hao
    Meng, He
    [J]. 2017 IEEE 14TH INTERNATIONAL CONFERENCE ON E-BUSINESS ENGINEERING (ICEBE 2017), 2017, : 97 - 102
  • [2] DFPS: Distributed FP-growth Algorithm Based on Spark
    Shi, Xiujin
    Chen, Shaozong
    Yang, Hui
    [J]. 2017 IEEE 2ND ADVANCED INFORMATION TECHNOLOGY, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (IAEAC), 2017, : 1725 - 1731
  • [3] A Caching-Based Parallel FP-Growth in Apache Spark
    Cai, Zhicheng
    Zhu, Xingyu
    Zheng, Yuehui
    Liu, Duan
    Xu, Lei
    [J]. ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2018, PT III, 2018, 11336 : 519 - 533
  • [4] Improvement and research of FP-growth algorithm based on distributed spark
    Deng Lingling
    Lou Yuansheng
    [J]. 2015 INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA (CCBD), 2015, : 105 - 108
  • [5] Improved FP-Growth Algorithm Based on Spark Platform and its Analysis and Excavation in Clinical Data of Lung Cancer
    Fang Pei-pei
    Xie Jia-dong
    Yang Tao
    Hu Kong-fa
    Hu Chen-jun
    Mao Yu-qing
    He Ju
    [J]. 2017 INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS, ELECTRONICS AND CONTROL (ICCSEC), 2017, : 1447 - 1450
  • [6] Towards Enhancing the Performance of Parallel FP-Growth on Spark
    Essam, Amr
    Abdel-Fattah, Manal A.
    Abdelhamid, Laila
    [J]. IEEE ACCESS, 2022, 10 : 286 - 296
  • [7] An Improved FP-Growth Algorithm Based on SOM Partition
    Jia, Kuikui
    Liu, Haibin
    [J]. DATA SCIENCE, PT 1, 2017, 727 : 166 - 178
  • [8] Research and Application on Web Information Retrieval Based on Improved FP-Growth Algorithm
    JIAO Minghai
    [J]. Wuhan University Journal of Natural Sciences, 2006, (05) : 1065 - 1068
  • [9] Research on Association Rules Parallel Algorithm Based on FP-Growth
    Chen, Ke
    Zhang, Lijun
    Li, Sansi
    Ke, Wende
    [J]. INFORMATION COMPUTING AND APPLICATIONS, PT II, 2011, 244 : 249 - +
  • [10] S-FPG: A Parallel Version of FP-Growth Algorithm under Apache Spark™
    Gassama, Aissatou Diaby Dite
    Camara, Fode
    Ndiaye, Samba
    [J]. 2017 2ND IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA ANALYSIS (ICCCBDA 2017), 2017, : 98 - 101