An Improved PrePost Algorithm for Frequent Pattern Mining with Hadoop on Cloud

Cited by: 6
Authors
Thakare, Sanket [1]
Rathi, Sheetal [1]
Sedamkar, R. R. [1]
Affiliation
[1] Thakur Coll Engn & Technol, Bombay 400101, Maharashtra, India
Keywords
Big Data; Data Mining; PPC Tree; Data Node; Name Node; Cloud Computing; S3; Storage;
DOI
10.1016/j.procs.2016.03.027
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Advances in internet technologies are causing the volume of data to grow rapidly, and research on extracting valuable information from such large amounts of data is gaining importance. Many algorithms have been proposed for this purpose. PrePost is one of the well-known algorithms for frequent pattern mining; it uses the N-list data structure to mine frequent itemsets. However, its performance degrades when processing large amounts of data. Hadoop is a well-known framework for processing data at that scale. This paper proposes the Improved PrePost algorithm, which combines PrePost with features of Hadoop in order to process large data efficiently. The efficiency of PrePost is enhanced by implementing a compact PPC tree with the general tree method and by finding frequent itemsets without generating candidate itemsets. An architecture for the Improved PrePost algorithm on a public cloud is also proposed. The results show that, as dataset size increases, the Improved PrePost algorithm gives 60% better performance.
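The abstract names the PPC tree and the N-list without showing them. The sketch below illustrates the baseline idea as it is commonly described for PrePost: a prefix tree whose nodes carry pre-order and post-order ranks, per-item N-lists of (pre, post, count) triples, and an ancestor test that yields the support of a 2-itemset without generating candidates. It is not the paper's Improved PrePost and involves no Hadoop; all function names and the toy transactions are hypothetical, and the quadratic intersection is a simplification chosen for readability.

```python
from collections import Counter, defaultdict


class Node:
    """One PPC-tree node: an item, its count, and pre/post-order ranks."""

    def __init__(self, item):
        self.item = item
        self.count = 0
        self.children = {}
        self.pre = -1
        self.post = -1


def build_ppc_tree(transactions, min_support):
    """Build a prefix tree over frequent items sorted by descending support."""
    freq = Counter(item for t in transactions for item in set(t))
    frequent = sorted(
        (i for i, c in freq.items() if c >= min_support),
        key=lambda i: (-freq[i], i),  # ties broken alphabetically (assumption)
    )
    rank = {item: r for r, item in enumerate(frequent)}
    root = Node(None)
    for t in transactions:
        node = root
        for item in sorted((i for i in set(t) if i in rank), key=rank.get):
            node = node.children.setdefault(item, Node(item))
            node.count += 1
    return root


def number_nodes(root):
    """Assign pre-order and post-order ranks with one depth-first walk."""
    counters = {"pre": 0, "post": 0}

    def visit(node):
        node.pre = counters["pre"]
        counters["pre"] += 1
        for child in node.children.values():
            visit(child)
        node.post = counters["post"]
        counters["post"] += 1

    visit(root)


def item_nlists(root):
    """Collect (pre, post, count) triples per item, in ascending pre-order."""
    nlists = defaultdict(list)

    def visit(node):
        if node.item is not None:
            nlists[node.item].append((node.pre, node.post, node.count))
        for child in node.children.values():
            visit(child)

    visit(root)
    return dict(nlists)


def intersect(ancestor_nl, descendant_nl):
    """N-list of a 2-itemset: keep ancestor codes, sum descendant counts."""
    merged = {}
    for x1, y1, _ in ancestor_nl:
        for x2, y2, z2 in descendant_nl:
            if x1 < x2 and y1 > y2:  # node 1 is an ancestor of node 2
                merged[(x1, y1)] = merged.get((x1, y1), 0) + z2
    return [(x, y, z) for (x, y), z in merged.items()]


def support(nlist):
    """Support of an itemset is the sum of counts in its N-list."""
    return sum(z for _, _, z in nlist)


transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"]]
root = build_ppc_tree(transactions, min_support=2)
number_nodes(root)
nlists = item_nlists(root)
ab = intersect(nlists["a"], nlists["b"])
print(support(nlists["a"]), support(ab))  # prints: 3 2
```

The ancestor test relies on a standard property of tree traversals: a node is an ancestor of another exactly when it comes earlier in pre-order and later in post-order. The original PrePost description performs this intersection with a linear merge over the two sorted lists; the nested loop here trades that efficiency for brevity.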
Pages: 207-214
Page count: 8