Two-Stage Clustering with k-Means Algorithm

Authors
Salman, Raied [1]
Kecman, Vojislav [1]
Li, Qi [1]
Strack, Robert [1]
Test, Erick [1]
Affiliations
[1] Virginia Commonwealth Univ, Dept Comp Sci, Richmond, VA 23284 USA
Keywords
Data Mining; Clustering; k-means algorithm; Distance Calculation
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
k-means has recently been recognized as one of the best algorithms for clustering unlabeled data. Because k-means depends mainly on calculating the distances between all data points and the centers, its cost is high when the dataset is large (for example more than 500MG points). We suggest a two-stage algorithm to reduce the calculation cost for huge datasets. The first stage is a fast calculation that uses a small portion of the data to produce the best location of the centers. The second stage is a slow calculation in which the initial centers are taken from the first stage. The fast and slow stages represent the movement of the centers. In the slow stage, the whole dataset can be used to get the exact location of the centers. The cost of the fast stage is very low because of the small size of the data chosen. The cost of the slow stage is also small because of the low number of iterations.
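The two stages described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the small portion is chosen or how the first-stage centers are initialized, so uniform random sampling, the `sample_fraction` parameter, the convergence tolerance, and the function names `lloyd` and `two_stage_kmeans` are all hypothetical choices made for this example.

```python
import numpy as np


def lloyd(points, centers, max_iter=100, tol=1e-6):
    """Plain k-means (Lloyd) iterations from the given starting centers.

    Returns the final centers and the number of iterations performed.
    """
    centers = centers.copy()
    for iteration in range(1, max_iter + 1):
        # Squared Euclidean distance from every point to every center.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = points[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old center
                new_centers[j] = members.mean(axis=0)
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:
            break
    return centers, iteration


def two_stage_kmeans(points, k, sample_fraction=0.1, seed=0):
    """Fast stage on a random subsample, then slow stage on the full data.

    Uniform sampling and sample_fraction are assumptions of this sketch.
    """
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(points)
    sample_size = max(k, int(sample_fraction * n))
    sample = points[rng.choice(n, size=sample_size, replace=False)]

    # Fast stage: cheap iterations on the small sample.
    init = sample[rng.choice(sample_size, size=k, replace=False)]
    fast_centers, fast_iters = lloyd(sample, init)

    # Slow stage: the full dataset, started from the fast-stage centers.
    final_centers, slow_iters = lloyd(points, fast_centers)
    return final_centers, fast_centers, fast_iters, slow_iters
```

Comparing `fast_iters` and `slow_iters` on a given dataset gives a rough sense of how much work each stage does; the sketch makes no claim about the savings reported in the paper.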
Pages: 110-122
Page count: 13