Parallel bisecting k-means with prediction clustering algorithm

Cited by: 30
Authors
Li, Yanjun [1]
Chung, Soon M. [1]
Affiliations
[1] Wright State Univ, Dept Comp Sci & Engn, Dayton, OH 45435 USA
Source
JOURNAL OF SUPERCOMPUTING | 2007 / Vol. 39 / Issue 01
Keywords
clustering; bisecting k-means; parallel processing; performance analysis
DOI
10.1007/s11227-006-0002-7
Chinese Library Classification (CLC) code
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
In this paper, we propose a new parallel clustering algorithm, named Parallel Bisecting k-means with Prediction (PBKP), for message-passing multiprocessor systems. Bisecting k-means tends to produce clusters of similar sizes, and according to our experiments, it produces clusters with smaller entropy (i.e., purer clusters) than k-means does. Our PBKP algorithm fully exploits the data-parallelism of the bisecting k-means algorithm, and adopts a prediction step to balance the workloads of multiple processors to achieve a high speedup. We implemented PBKP on a cluster of Linux workstations and analyzed its performance. Our experimental results show that the speedup of PBKP is linear with the number of processors and the number of data points. Moreover, PBKP scales up better than the parallel k-means with respect to the dimension and the desired number of clusters.
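For readers unfamiliar with the base method, the following is a minimal sketch of sequential bisecting k-means, the algorithm that PBKP parallelizes. It is not the authors' implementation and includes neither the parallel decomposition nor the prediction step. The function names (`two_means`, `bisecting_kmeans`), the rule of always splitting the largest cluster, and the fixed iteration cap are assumptions made for illustration.

```python
import random


def _dist2(a, b):
    # Squared Euclidean distance between two points given as tuples.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _mean(points):
    # Component-wise mean of a non-empty list of points.
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def two_means(points, rng, iters=20):
    """Split `points` into two groups with basic k-means (k = 2)."""
    centers = rng.sample(points, 2)  # two initial centers drawn from the data
    for _ in range(iters):
        halves = ([], [])
        for p in points:
            # Assign to center 1 only if strictly closer; ties go to center 0.
            idx = 1 if _dist2(p, centers[1]) < _dist2(p, centers[0]) else 0
            halves[idx].append(p)
        if not halves[0] or not halves[1]:
            break  # degenerate split (e.g. duplicate points); caller handles it
        new_centers = [_mean(h) for h in halves]
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    return halves


def bisecting_kmeans(points, k, seed=0):
    """Repeatedly bisect one cluster until k clusters exist.

    Assumption: the largest cluster is chosen for splitting; other
    selection rules exist and the paper may use a different one.
    """
    rng = random.Random(seed)
    clusters = [list(points)]
    while len(clusters) < k:
        clusters.sort(key=len)
        target = clusters.pop()  # largest cluster
        if len(target) < 2:
            clusters.append(target)
            break  # nothing left to split
        left, right = two_means(target, rng)
        if not left or not right:
            clusters.append(target)
            break  # could not produce a proper split
        clusters += [left, right]
    return clusters


if __name__ == "__main__":
    data = [(0.0,), (1.0,), (2.0,), (100.0,), (101.0,), (102.0,)]
    for cluster in bisecting_kmeans(data, 2):
        print(sorted(cluster))
```

Because each bisection is an ordinary 2-means pass over one cluster, the per-point distance computations are independent of one another, which is the kind of data-parallelism the abstract refers to.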
Pages: 19 - 37
Page count: 19
Related papers
50 in total
  • [31] The global k-means clustering algorithm
    Likas, A
    Vlassis, N
    Verbeek, JJ
    [J]. PATTERN RECOGNITION, 2003, 36 (02) : 451 - 461
  • [32] An improved K-means clustering algorithm
    Huang, Xiuchang
    Su, Wei
    [J]. Journal of Networks, 2014, 9 (01) : 161 - 167
  • [33] An Enhancement of K-means Clustering Algorithm
    Gu, Jirong
    Zhou, Jieming
    Chen, Xianwei
    [J]. 2009 INTERNATIONAL CONFERENCE ON BUSINESS INTELLIGENCE AND FINANCIAL ENGINEERING, PROCEEDINGS, 2009, : 237 - 240
  • [34] A k-means based clustering algorithm
    Bloisi, Domenico Daniele
    Iocchi, Luca
    [J]. COMPUTER VISION SYSTEMS, PROCEEDINGS, 2008, 5008 : 109 - 118
  • [35] Adaptive K-Means clustering algorithm
    Chen, Hailin
    Wu, Xiuqing
    Hu, Junhua
    [J]. MIPPR 2007: PATTERN RECOGNITION AND COMPUTER VISION, 2007, 6788
  • [36] Improved Algorithm for the k-means Clustering
    Zhang, Sheng
    Wang, Shouqiang
    [J]. PROCEEDINGS OF THE 10TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION (WCICA 2012), 2012, : 4717 - 4720
  • [37] k*-means: A new generalized k-means clustering algorithm
    Cheung, YM
    [J]. PATTERN RECOGNITION LETTERS, 2003, 24 (15) : 2883 - 2893
  • [38] Enhancement of Parallel K-Means Algorithm
    Mathew, Juby
    Vijayakumar, R.
    [J]. 2015 INTERNATIONAL CONFERENCE ON INNOVATIONS IN INFORMATION, EMBEDDED AND COMMUNICATION SYSTEMS (ICIIECS), 2015,
  • [39] The study of parallel K-Means algorithm
    Zhang, Yufang
    Xiong, Zhongyang
    Mao, Jiali
    Ou, Ling
    [J]. WCICA 2006: SIXTH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-12, CONFERENCE PROCEEDINGS, 2006, : 5868 - +
  • [40] K*-Means: An Effective and Efficient K-means Clustering Algorithm
    Qi, Jianpeng
    Yu, Yanwei
    Wang, Lihong
    Liu, Jinglei
    [J]. PROCEEDINGS OF 2016 IEEE INTERNATIONAL CONFERENCES ON BIG DATA AND CLOUD COMPUTING (BDCLOUD 2016) SOCIAL COMPUTING AND NETWORKING (SOCIALCOM 2016) SUSTAINABLE COMPUTING AND COMMUNICATIONS (SUSTAINCOM 2016) (BDCLOUD-SOCIALCOM-SUSTAINCOM 2016), 2016, : 242 - 249