Scalable Parallel Clustering Approach for Large Data Using Parallel K Means and Firefly Algorithms

Authors
Mathew, Juby [1]
Vijayakumar, R. [2]
Affiliations
[1] Amaljyothi Coll Engn, Dept MCA, Kanjirappally, Kerala, India
[2] Mahatma Gandhi Univ, Kottayam, Kerala, India
Source
2014 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND APPLICATIONS (ICHPCA) | 2014
Keywords
Clustering; k-means; parallel k-means; Firefly algorithm; join and fork parallelism;
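DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809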
Abstract
This paper identifies the limitations of the k-means algorithm and proposes a parallelization of k-means using a firefly-based clustering method. The new parallel architecture can handle a large number of clusters. The firefly algorithm finds the initial optimal cluster centroids, and the k-means algorithm then refines these optimized centroids to improve clustering accuracy. The final convergence issue is also addressed and solved to a great extent. Finally, the modified algorithm is compared experimentally with parallel k-means, and its performance is found to be better than that of the existing algorithm. Four typical benchmark data sets from the UCI Machine Learning Repository are used to demonstrate the results of the techniques. To achieve this, we can use the fork/join method in Java programming, which is the most effective design method for achieving good parallel performance.
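The abstract names Java fork/join as the parallelization mechanism but this record does not include the authors' code. As a minimal sketch only, the assignment step of k-means (labeling each point with its nearest centroid) could be split recursively with `RecursiveAction` as below. The class name `ForkJoinAssign`, the `THRESHOLD` cutoff, and the helper names are assumptions made for illustration, and the centroids are simply passed in; in the paper they would come from the firefly stage, which is not shown here.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class ForkJoinAssign {
    // Hypothetical cutoff: ranges at or below this size are processed sequentially.
    static final int THRESHOLD = 1_000;

    // Labels points[lo..hi) with the index of the nearest centroid.
    static class AssignTask extends RecursiveAction {
        private final double[][] points;
        private final double[][] centroids;
        private final int[] labels;
        private final int lo;
        private final int hi;

        AssignTask(double[][] points, double[][] centroids, int[] labels, int lo, int hi) {
            this.points = points;
            this.centroids = centroids;
            this.labels = labels;
            this.lo = lo;
            this.hi = hi;
        }

        @Override
        protected void compute() {
            if (hi - lo <= THRESHOLD) {
                // Small enough: do the work directly.
                for (int i = lo; i < hi; i++) {
                    labels[i] = nearest(points[i], centroids);
                }
                return;
            }
            // Fork: split the range in half; join: invokeAll waits for both halves.
            int mid = (lo + hi) >>> 1;
            invokeAll(new AssignTask(points, centroids, labels, lo, mid),
                      new AssignTask(points, centroids, labels, mid, hi));
        }
    }

    // Index of the centroid with the smallest squared Euclidean distance to p.
    static int nearest(double[] p, double[][] centroids) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centroids.length; c++) {
            double d = 0.0;
            for (int j = 0; j < p.length; j++) {
                double diff = p[j] - centroids[c][j];
                d += diff * diff;
            }
            if (d < bestDist) {
                bestDist = d;
                best = c;
            }
        }
        return best;
    }

    // Runs one parallel assignment pass and returns the label of every point.
    static int[] assign(double[][] points, double[][] centroids) {
        int[] labels = new int[points.length];
        ForkJoinPool.commonPool().invoke(
                new AssignTask(points, centroids, labels, 0, points.length));
        return labels;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {0.5, 0.2}, {9, 9}, {10, 10.5}};
        double[][] centroids = {{0, 0}, {10, 10}};
        System.out.println(Arrays.toString(assign(points, centroids)));
    }
}
```

Each task writes to a disjoint slice of `labels`, so no locking is needed in this step. A full k-means iteration would also recompute the centroids from the labels, which this sketch leaves out.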
Pages: 8