Performance Analysis of Parallel K-Means with Optimization Algorithms for Clustering on Spark

Cited by: 4
Authors
Santhi, V. [1]
Jose, Rini [1]
Affiliations
[1] Anna Univ, PSG Coll Technol, Coimbatore, Tamil Nadu, India
Keywords
Clustering; K-Means; Bat algorithm; Firefly algorithm; Big data; Spark
DOI
10.1007/978-3-319-72344-0_12
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Clustering divides data into meaningful, useful groups, known as clusters, without any prior knowledge about the data. One drawback of K-Means clustering is its dependence on the estimate of the initial centroids, which influences the performance of the algorithm. To overcome this issue, optimization algorithms such as Bat and Firefly are executed as a pre-processing step. These algorithms return optimal centroids, which are given as input to the K-Means algorithm. Because clustering is carried out on large data sets, Apache Spark, an open-source software framework, is used. The performance of the optimization algorithms is evaluated and the best algorithm is determined.
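The abstract describes a two-stage pipeline: a metaheuristic searches for good starting centroids, and K-Means then refines them. The following is a minimal single-machine sketch of that idea using a basic firefly search, not the authors' implementation and not a Spark job. The function names (`sse`, `firefly_init`, `kmeans`), the use of the sum of squared errors as the fitness function, and all parameter values (`beta0`, `gamma`, `alpha`, swarm size, iteration counts) are assumptions made for illustration.

```python
import math
import random

def sqdist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(sqdist(p, c) for c in centroids) for p in points)

def firefly_init(points, k, n_fireflies=8, n_iter=20,
                 beta0=1.0, gamma=0.01, alpha=0.1, seed=0):
    """Search for k starting centroids; each firefly is one candidate set.

    Assumption: lower SSE means a brighter firefly.
    """
    rng = random.Random(seed)
    swarm = [[list(p) for p in rng.sample(points, k)]
             for _ in range(n_fireflies)]
    cost = [sse(points, f) for f in swarm]
    best = min(range(n_fireflies), key=cost.__getitem__)
    best_sol, best_cost = [c[:] for c in swarm[best]], cost[best]
    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] >= cost[i]:
                    continue  # only move toward brighter fireflies
                r2 = sum(sqdist(ci, cj) for ci, cj in zip(swarm[i], swarm[j]))
                beta = beta0 * math.exp(-gamma * r2)  # attractiveness decays with distance
                for ci, cj in zip(swarm[i], swarm[j]):
                    for d in range(len(ci)):
                        ci[d] += beta * (cj[d] - ci[d]) + alpha * (rng.random() - 0.5)
                cost[i] = sse(points, swarm[i])
                if cost[i] < best_cost:
                    best_sol, best_cost = [c[:] for c in swarm[i]], cost[i]
    return best_sol

def kmeans(points, centroids, n_iter=50):
    """Lloyd's algorithm started from the supplied centroids."""
    centroids = [list(c) for c in centroids]
    for _ in range(n_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: sqdist(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [[sum(col) / len(cl) for col in zip(*cl)] if cl else cen
                   for cl, cen in zip(clusters, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids

# Usage on synthetic 2-D data (two Gaussian blobs, hypothetical values).
data_rng = random.Random(42)
data = ([[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(50)]
        + [[data_rng.gauss(8, 1), data_rng.gauss(8, 1)] for _ in range(50)])
start = firefly_init(data, k=2)
final = kmeans(data, start)
```

Comparing `sse(data, start)` with `sse(data, final)` shows how much K-Means improves on the metaheuristic's starting point. In a Spark setting, the assignment step and the fitness evaluation are the parts that would be distributed across partitions; this sketch keeps everything in local lists for clarity.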
Pages: 158-162
Number of pages: 5