Two-Stage Clustering with k-Means Algorithm

Cited by: 0
Authors
Salman, Raied [1 ]
Kecman, Vojislav [1 ]
Li, Qi [1 ]
Strack, Robert [1 ]
Test, Erick [1 ]
Affiliation
[1] Virginia Commonwealth Univ, Dept Comp Sci, Richmond, VA 23284 USA
Keywords
Data Mining; Clustering; k-means algorithm; Distance Calculation
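DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract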
k-means has recently been recognized as one of the best algorithms for clustering unlabeled data. Because k-means depends mainly on calculating the distances between all data points and the centers, its cost is high when the dataset is large (for example more than 500MG points). We propose a two-stage algorithm to reduce the computational cost for huge datasets. The first stage is a fast calculation that uses a small portion of the data to estimate the best locations of the centers. The second stage is a slow calculation that takes its initial centers from the first stage. The terms fast and slow describe the movement of the centers in each stage. In the slow stage the whole dataset can be used to obtain the exact locations of the centers. The cost of the fast stage is very low because the chosen subset of the data is small. The cost of the slow stage is also low because it needs only a small number of iterations.
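The two-stage idea described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the function names `lloyd` and `two_stage_kmeans`, the uniform random sampling, the `sample_fraction` default of 0.1, and the convergence tolerance are all assumptions made for the example, and plain Lloyd iterations with squared Euclidean distance stand in for whatever k-means variant the paper uses.

```python
import random


def lloyd(points, centers, max_iter=100, tol=1e-9):
    """Plain k-means (Lloyd) iterations. Returns (centers, iterations_used)."""
    centers = [list(c) for c in centers]
    dim = len(centers[0])
    for it in range(1, max_iter + 1):
        sums = [[0.0] * dim for _ in centers]
        counts = [0] * len(centers)
        for p in points:
            # Assign the point to its nearest center (squared Euclidean distance).
            j = min(range(len(centers)),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            counts[j] += 1
            for d in range(dim):
                sums[j][d] += p[d]
        # Move each center to the mean of its points; an empty cluster stays put.
        new_centers = [
            [s / counts[j] for s in sums[j]] if counts[j] else centers[j]
            for j in range(len(centers))
        ]
        # Largest squared movement of any center in this iteration.
        shift = max(sum((a - b) ** 2 for a, b in zip(c, n))
                    for c, n in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers, it


def two_stage_kmeans(points, k, sample_fraction=0.1, seed=0):
    """Hypothetical two-stage driver: fast stage on a sample, slow stage on all data."""
    rng = random.Random(seed)
    # Fast stage: cluster only a small random portion of the data (assumed uniform sampling).
    sample_size = min(len(points), max(k, int(len(points) * sample_fraction)))
    sample = rng.sample(points, sample_size)
    initial = rng.sample(sample, k)
    fast_centers, fast_iters = lloyd(sample, initial)
    # Slow stage: refine on the whole dataset, starting from the fast-stage centers.
    slow_centers, slow_iters = lloyd(points, fast_centers)
    return slow_centers, fast_iters, slow_iters


if __name__ == "__main__":
    rng = random.Random(1)
    data = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(500)]
    data += [(rng.gauss(10, 0.5), rng.gauss(10, 0.5)) for _ in range(500)]
    centers, fast_iters, slow_iters = two_stage_kmeans(data, k=2)
    print(sorted(centers), fast_iters, slow_iters)
```

In this sketch each fast-stage iteration touches only the sample, so it costs roughly `sample_fraction` times a full pass. If the sample is representative, the slow stage starts near the final centers and would be expected to need few full passes, which is the saving the abstract describes.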
Pages: 110-122
Number of pages: 13