Parallel bisecting k-means with prediction clustering algorithm

Cited by: 30
Authors
Li, Yanjun [1]
Chung, Soon M. [1]
Affiliations
[1] Wright State Univ, Dept Comp Sci & Engn, Dayton, OH 45435 USA
Source
JOURNAL OF SUPERCOMPUTING | 2007 / Vol. 39 / Issue 01
Keywords
clustering; bisecting k-means; parallel processing; performance analysis
DOI
10.1007/s11227-006-0002-7
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we propose a new parallel clustering algorithm, named Parallel Bisecting k-means with Prediction (PBKP), for message-passing multiprocessor systems. Bisecting k-means tends to produce clusters of similar sizes, and according to our experiments, it produces clusters with smaller entropy (i.e., purer clusters) than k-means does. Our PBKP algorithm fully exploits the data-parallelism of the bisecting k-means algorithm, and adopts a prediction step to balance the workloads of multiple processors to achieve a high speedup. We implemented PBKP on a cluster of Linux workstations and analyzed its performance. Our experimental results show that the speedup of PBKP is linear with the number of processors and the number of data points. Moreover, PBKP scales up better than the parallel k-means with respect to the dimension and the desired number of clusters.
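For readers unfamiliar with the base method, the following is a minimal serial sketch of bisecting k-means in plain Python. It is not the PBKP algorithm from the paper: there is no message passing and no prediction step. The function names (`two_means`, `bisecting_kmeans`), the rule of always splitting the largest cluster, and the fixed iteration cap are assumptions made for illustration only.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def two_means(points, rng, iters=20):
    """Split points into two groups with basic 2-means (Lloyd iterations)."""
    c0, c1 = rng.sample(points, 2)
    left, right = [], []
    for _ in range(iters):
        left = [p for p in points if sq_dist(p, c0) <= sq_dist(p, c1)]
        right = [p for p in points if sq_dist(p, c0) > sq_dist(p, c1)]
        if not left or not right:
            # Degenerate case (e.g. identical seeds): fall back to an even split.
            half = len(points) // 2
            return points[:half], points[half:]
        new_c0, new_c1 = mean(left), mean(right)
        if (new_c0, new_c1) == (c0, c1):
            break  # centroids stopped moving
        c0, c1 = new_c0, new_c1
    return left, right


def bisecting_kmeans(points, k, seed=0):
    """Repeatedly bisect one cluster until k clusters exist."""
    rng = random.Random(seed)
    clusters = [list(points)]
    while len(clusters) < k:
        # Assumption: pick the largest cluster to split; other rules exist.
        clusters.sort(key=len)
        target = clusters.pop()
        if len(target) < 2:
            clusters.append(target)
            break  # nothing left that can be split
        left, right = two_means(target, rng)
        clusters += [left, right]
    return clusters


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    for cluster in bisecting_kmeans(data, 2):
        print(sorted(cluster))
```

Because every split only touches the points of one cluster, the distance computations inside `two_means` are independent per point, which is the kind of data parallelism the abstract refers to.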
Pages: 19-37
Page count: 19
Related papers
50 in total
  • [1] Parallel bisecting k-means with prediction clustering algorithm
    Yanjun Li
    Soon M. Chung
    [J]. The Journal of Supercomputing, 2007, 39 : 19 - 37
  • [2] Drug Audit Based on Bisecting K-means Clustering Algorithm
    Tao, Yingjuan
    Deng, Jinsheng
    Song, Xingshen
    [J]. 2019 INTERNATIONAL CONFERENCE ON CYBER-ENABLED DISTRIBUTED COMPUTING AND KNOWLEDGE DISCOVERY (CYBERC), 2019, : 265 - 270
  • [3] A NOVEL APPROACH TOWARDS BISECTING K-MEANS CLUSTERING ALGORITHM PARALLELISM
    Zhang Junwei
    Wang Nianbin
    Huang Shaobin
    [J]. 2011 3RD INTERNATIONAL CONFERENCE ON COMPUTER TECHNOLOGY AND DEVELOPMENT (ICCTD 2011), VOL 2, 2012, : 25 - 31
  • [4] Collaborative Filtering Recommendation Algorithm Based on Bisecting K-means Clustering
    Liu, Jia
    Kang, Xin
    Nishide, Shun
    Ren, Fuji
    [J]. INTERNATIONAL SYMPOSIUM ON ARTIFICIAL INTELLIGENCE AND ROBOTICS 2020, 2020, 11574
  • [5] Parallel K-means clustering algorithm on DNA dataset
    Othman, F
    Abdullah, R
    Rashid, NA
    Salam, RA
    [J]. PARALLEL AND DISTRIBUTED COMPUTING: APPLICATIONS AND TECHNOLOGIES, PROCEEDINGS, 2004, 3320 : 248 - 251
  • [6] An Improved parallel K-means Clustering Algorithm with MapReduce
    Liao, Qing
    Yang, Fan
    Zhao, Jingming
    [J]. 2013 15TH IEEE INTERNATIONAL CONFERENCE ON COMMUNICATION TECHNOLOGY (ICCT), 2013, : 764 - 768
  • [7] Enhanced Parallel Implementation of the K-Means Clustering Algorithm
    Baydoun, Mohammed
    Dawi, Mohammad
    Ghaziri, Hassan
    [J]. 2016 3RD INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTATIONAL TOOLS FOR ENGINEERING APPLICATIONS (ACTEA), 2016, : 7 - 11
  • [8] Research on k-means Clustering Algorithm An Improved k-means Clustering Algorithm
    Shi Na
    Liu Xumin
    Guan Yong
    [J]. 2010 THIRD INTERNATIONAL SYMPOSIUM ON INTELLIGENT INFORMATION TECHNOLOGY AND SECURITY INFORMATICS (IITSI 2010), 2010, : 63 - 67
  • [9] Empirical Evaluation of K-Means, Bisecting K-Means, Fuzzy C-Means and Genetic K-Means Clustering Algorithms
    Banerjee, Shreya
    Choudhary, Ankit
    Pal, Somnath
    [J]. 2015 IEEE INTERNATIONAL WIE CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING (WIECON-ECE), 2015, : 172 - 176
  • [10] Enhanced bisecting k-means clustering using intermediate cooperation
    Kashef, R.
    Kamel, M. S.
    [J]. PATTERN RECOGNITION, 2009, 42 (11) : 2557 - 2569