An Analytic Survey on MapReduce based K-Means and its Hybrid Clustering Algorithms

Cited by: 0
Authors
Bagde, Utkarsha [1]
Tripathi, Priyanka [1]
Affiliation
[1] NITTTR, Dept Comp Engn & Applicat, Bhopal, India
Keywords
Clustering; K-Means; K-Harmonic Means; MapReduce
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Data clustering, the common technique of arranging similar data into groups, is a challenging task in today's era. Traditional clustering algorithms are effective for handling large amounts of data from sources such as social media, business, and the internet. However, the serial computation in these traditional algorithms has very high time complexity. The K-Means algorithm is sensitive to its initial points and prone to local optima, and it is often run many times for a given K value. K-Harmonic Means is insensitive to the initialization of the centers and is suitable for large-scale datasets. To overcome these defects of traditional clustering algorithms, this paper suggests a hybrid method. MapReduce is a parallel programming model for processing and generating data sets with a parallel, distributed algorithm on a cluster. The paper presents observations on the different MapReduce-based algorithms and, building on those observations, proposes a new hybrid clustering algorithm based on MapReduce.
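As an illustration only, and not the hybrid algorithm proposed in the paper, one K-Means iteration can be sketched in the MapReduce style. The map step assigns each point to its nearest center, and the reduce step averages the points that share a center. The function names (`map_phase`, `reduce_phase`, `kmeans_step`) are hypothetical, and the sketch runs in a single process rather than on a cluster.

```python
# Minimal single-process sketch of one MapReduce-style K-Means iteration.
# All names are hypothetical; this is not the paper's hybrid method.

def nearest(point, centers):
    """Index of the center with the smallest squared Euclidean distance."""
    return min(
        range(len(centers)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centers[i])),
    )

def map_phase(points, centers):
    """Emit (center_index, (point, 1)) for every input point."""
    for point in points:
        yield nearest(point, centers), (point, 1)

def reduce_phase(pairs, centers):
    """Sum the points per center index and return the averaged centers."""
    totals = {}
    for key, (point, count) in pairs:
        acc, n = totals.get(key, ([0.0] * len(point), 0))
        totals[key] = ([a + p for a, p in zip(acc, point)], n + count)
    # A center that received no points keeps its previous position.
    return [
        tuple(v / totals[i][1] for v in totals[i][0]) if i in totals else centers[i]
        for i in range(len(centers))
    ]

def kmeans_step(points, centers):
    """Run one map pass followed by one reduce pass."""
    return reduce_phase(map_phase(points, centers), centers)

points = [(0, 0), (0, 2), (10, 10), (10, 12)]
centers = [(0, 0), (10, 10)]
new_centers = kmeans_step(points, centers)
```

In a real MapReduce job the emitted pairs would be shuffled by key across machines before the reduce step; here the grouping happens inside `reduce_phase`. Repeating `kmeans_step` until the centers stop moving gives the usual K-Means loop.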
Pages: 32-36
Page count: 5
Related papers
  • [1] A Novel MapReduce Based k-Means Clustering
    Sinha, Ankita
    Jana, Prasanta K.
    [J]. PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND COMMUNICATION, 2017, 458 : 247 - 255
  • [2] Parallel K-Means Clustering Based on MapReduce
    Zhao, Weizhong
    Ma, Huifang
    He, Qing
    [J]. CLOUD COMPUTING, PROCEEDINGS, 2009, 5931 : 674 - 679
  • [3] A Survey on Various K-Means algorithms for Clustering
    Singh, Malwinder
    Bansal, Meenakshi
    [J]. INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2015, 15 (06) : 60 - 65
  • [4] K-means Clustering Optimization Algorithm Based on MapReduce
    Li, Zhihua
    Song, Xudong
    Zhu, Wenhui
    Chen, Yanxia
    [J]. PROCEEDINGS OF THE 2015 INTERNATIONAL SYMPOSIUM ON COMPUTERS & INFORMATICS, 2015, 13 : 198 - 203
  • [5] A MapReduce-based K-means clustering algorithm
    YiMin Mao
    DeJin Gan
    D. S. Mwakapesa
    Y. A. Nanehkaran
    Tao Tao
    XueYu Huang
    [J]. The Journal of Supercomputing, 2022, 78 : 5181 - 5202
  • [6] A MapReduce-based K-means clustering algorithm
    Mao, YiMin
    Gan, DeJin
    Mwakapesa, D. S.
    Nanehkaran, Y. A.
    Tao, Tao
    Huang, XueYu
    [J]. JOURNAL OF SUPERCOMPUTING, 2022, 78 (04) : 5181 - 5202
  • [7] An Improved Sampling K-means Clustering Algorithm Based on MapReduce
    Zhang Ya-ling
    Wang Ya-nan
    [J]. 2017 13TH INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (ICNC-FSKD), 2017,
  • [8] MapReduce Design of K-Means Clustering Algorithm
    Anchalia, Prajesh P.
    Koundinya, Anjan K.
    Srinath, N. K.
    [J]. 2013 INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND APPLICATIONS (ICISA 2013), 2013,
  • [9] An Efficient K-means Clustering Algorithm on MapReduce
    Li, Qiuhong
    Wang, Peng
    Wang, Wei
    Hu, Hao
    Li, Zhongsheng
    Li, Junxian
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2014, PT I, 2014, 8421 : 357 - 371
  • [10] A hybrid MapReduce-based k-means clustering using genetic algorithm for distributed datasets
    Sinha, Ankita
    Jana, Prasanta K.
    [J]. JOURNAL OF SUPERCOMPUTING, 2018, 74 (04) : 1562 - 1579