An Improved PrePost Algorithm for Frequent Pattern Mining with Hadoop on Cloud

Cited by: 6
Authors
Thakare, Sanket [1]
Rathi, Sheetal [1]
Sedamkar, R. R. [1]
Affiliations
[1] Thakur Coll Engn & Technol, Bombay 400101, Maharashtra, India
Keywords
Big Data; Data Mining; PPC Tree; Data Node; Name Node; Cloud Computing; S3; Storage;
DOI
10.1016/j.procs.2016.03.027
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Advances in internet technologies are causing the volume of data to grow rapidly day by day, and research on extracting valuable information from such large amounts of data is gaining importance. Many studies have been carried out and various algorithms have been proposed. The PrePost algorithm is one of the well-known algorithms for frequent pattern mining. It uses the N-list data structure to mine frequent itemsets. However, the performance of PrePost degrades when processing large amounts of data. Hadoop is a well-known framework for processing data at this scale. This paper proposes the Improved PrePost algorithm, which combines PrePost with the features of Hadoop in order to process large data efficiently. The efficiency of the PrePost algorithm is enhanced by implementing a compact PPC tree with the general tree method and by finding frequent itemsets without generating candidate itemsets. An architecture for the Improved PrePost algorithm on a public cloud is also proposed. The results show that, as the dataset size increases, the Improved PrePost algorithm gives 60% better performance.
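To make the data structures named in the abstract concrete, the following is a minimal single-machine sketch of a PPC tree and the N-lists derived from it, as they are commonly described for the baseline PrePost algorithm. It is not the paper's Improved PrePost implementation and involves no Hadoop code. The names `Node`, `build_ppc_tree`, `n_lists`, and `pair_support`, as well as the toy transactions, are hypothetical and chosen only for illustration. The pair-support routine uses a simple quadratic scan for readability rather than a linear merge of the two lists.

```python
from collections import Counter


class Node:
    """One PPC-tree node: an item, its count, and pre/post-order ranks."""

    def __init__(self, item):
        self.item = item
        self.count = 0
        self.children = {}  # item -> Node
        self.pre = None
        self.post = None


def build_ppc_tree(transactions, min_support):
    """Build a PPC tree over the frequent items and number its nodes."""
    support = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in support.items() if c >= min_support}
    root = Node(None)
    for t in transactions:
        # Keep frequent items, ordered by descending support (ties by name).
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            node = node.children.setdefault(item, Node(item))
            node.count += 1
    counters = {"pre": 0, "post": 0}

    def number(node):
        # Pre-order rank on entry, post-order rank after all children.
        node.pre = counters["pre"]
        counters["pre"] += 1
        for child in node.children.values():
            number(child)
        node.post = counters["post"]
        counters["post"] += 1

    number(root)
    return root


def n_lists(root):
    """Map each item to its N-list: (pre, post, count) triples by pre-order."""
    result = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if node.item is not None:
            result.setdefault(node.item, []).append(
                (node.pre, node.post, node.count))
        stack.extend(node.children.values())
    for triples in result.values():
        triples.sort()
    return result


def pair_support(ancestor_list, descendant_list):
    """Support of a 2-itemset from two N-lists (quadratic scan for clarity).

    ancestor_list must belong to the item that comes first in the
    support-descending order used to build the tree.
    """
    total = 0
    for pre_d, post_d, count_d in descendant_list:
        for pre_a, post_a, _ in ancestor_list:
            # A node is an ancestor iff its pre-order rank is smaller
            # and its post-order rank is larger.
            if pre_a < pre_d and post_a > post_d:
                total += count_d
                break
    return total


# Toy data (hypothetical): five transactions over items a, b, c.
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a"]]
root = build_ppc_tree(transactions, min_support=2)
lists = n_lists(root)
print(pair_support(lists["a"], lists["b"]))  # 2
```

The support of a single item is the sum of the counts in its N-list, and the support of a pair is read off the ancestor/descendant test on the pre/post ranks, so no candidate itemset has to be counted against the raw transactions.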
Pages: 207-214
Page count: 8