Parallel Processing of Big Data using Power Iteration Clustering over MapReduce

Cited by: 2
Authors
Jayalatchumy, D. [1]
Thambidurai, P. [1]
Alamelu, A. Vasumathi [1]
Affiliations
[1] PKIET, CSE, Karaikal, India
Source
2014 WORLD CONGRESS ON COMPUTING AND COMMUNICATION TECHNOLOGIES (WCCCT 2014) | 2014
Keywords
p-PIC; Hadoop; Fault tolerance; GBC;
DOI
10.1109/WCCCT.2014.16
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Extracting useful information from datasets measured in gigabytes and terabytes is a real challenge for data miners. Clustering algorithms face scalability problems when dealing with big data. These problems can be handled by running parallel algorithms, together with their input data, on high-performance computers. Graph-based applications in particular require considerable computation time. Power iteration clustering (PIC) is simple, fast, and relatively scalable, but it requires the data and its associated matrix to fit in memory, which becomes infeasible for big data applications. Scalability has been improved with p-PIC, and this paper focuses on exploring different parallelization strategies for minimizing communication cost. The algorithm runs on the MapReduce parallel framework. The p-PIC algorithm is implemented on the Hadoop cloud, a parallel storage and computing platform.
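The abstract does not spell out the algorithm, so the following is only a minimal sketch of the commonly described PIC formulation (row-normalize the affinity matrix, then repeatedly multiply and L1-normalize a vector, stopping early). The row-block split loosely mimics how a map step could process part of the matrix; the function names, the `n_chunks` parameter, and the in-process "map" are illustrative assumptions, not the paper's Hadoop implementation.

```python
from typing import List

Matrix = List[List[float]]


def partial_product(rows: Matrix, v: List[float]) -> List[float]:
    """'Map' step (assumed): multiply one block of rows of W by the full vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in rows]


def pic_embedding(affinity: Matrix, n_chunks: int = 2,
                  max_iter: int = 50, tol: float = 1e-5) -> List[float]:
    """Return the one-dimensional PIC embedding of the points behind `affinity`."""
    n = len(affinity)
    degrees = [sum(row) for row in affinity]
    # W = D^-1 A: each row of the affinity matrix divided by its row sum.
    w = [[a / d for a in row] for row, d in zip(affinity, degrees)]
    # Degree-based starting vector, normalized to sum to 1.
    total = sum(degrees)
    v = [d / total for d in degrees]
    # Split W into contiguous row blocks; each block could go to one worker.
    size = -(-n // n_chunks)  # ceiling division
    chunks = [w[i:i + size] for i in range(0, n, size)]
    prev_delta = float("inf")
    for _ in range(max_iter):
        # 'Reduce' step (assumed): concatenate the partial products in row order.
        new_v = [x for chunk in chunks for x in partial_product(chunk, v)]
        norm = sum(abs(x) for x in new_v)
        new_v = [x / norm for x in new_v]
        delta = max(abs(a - b) for a, b in zip(new_v, v))
        v = new_v
        # Stop once the change per iteration has itself stopped changing.
        if abs(delta - prev_delta) < tol:
            break
        prev_delta = delta
    return v
```

In a full pipeline the returned values would typically be passed to k-means to obtain cluster labels. Only the current vector needs to be shared between workers in each iteration, which is why the row-block split keeps communication small relative to the matrix size.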
Pages: 176-178
Number of pages: 3