Two-Stage Clustering with k-Means Algorithm

Cited by: 0
Authors
Salman, Raied [1]
Kecman, Vojislav [1]
Li, Qi [1]
Strack, Robert [1]
Test, Erick [1]
Affiliations
[1] Virginia Commonwealth Univ, Dept Comp Sci, Richmond, VA 23284 USA
Keywords
Data Mining; Clustering; k-means algorithm; Distance Calculation
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
k-means has recently been recognized as one of the best algorithms for clustering unlabeled data. Because k-means depends mainly on computing the distance between every data point and every center, its cost is high when the dataset is large (for example more than 500MG points). We propose a two-stage algorithm to reduce the computational cost for huge datasets. The first stage is a fast calculation on a small portion of the data that estimates the best locations of the centers. The second stage is a slow calculation whose initial centers are taken from the first stage. The names fast and slow describe the movement of the centers in each stage. In the slow stage the whole dataset can be used to find the exact locations of the centers. The cost of the fast stage is very low because the chosen subset of the data is small. The cost of the slow stage is also small because it needs only a low number of iterations.
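The two stages described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the subset is chosen, how the fast stage is initialized, or what stopping rule is used, so uniform random sampling, random initialization from the sample, and a center-shift tolerance are assumed here. The names `lloyd`, `two_stage_kmeans`, `sample_fraction`, and `tol` are hypothetical.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd(points, centers, max_iter=100, tol=1e-9):
    """Standard k-means (Lloyd) iterations; returns (centers, iterations used)."""
    centers = [tuple(c) for c in centers]
    dim = len(centers[0])
    for iteration in range(1, max_iter + 1):
        # Assignment step: add each point to the running sum of its nearest center.
        sums = [[0.0] * dim for _ in centers]
        counts = [0] * len(centers)
        for p in points:
            j = min(range(len(centers)),
                    key=lambda i: squared_distance(p, centers[i]))
            counts[j] += 1
            for d in range(dim):
                sums[j][d] += p[d]
        # Update step: an empty cluster keeps its previous center (assumption).
        new_centers = [
            tuple(x / n for x in s) if n else centers[j]
            for j, (s, n) in enumerate(zip(sums, counts))
        ]
        shift = max(squared_distance(a, b)
                    for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:  # assumed stopping rule: centers stopped moving
            break
    return centers, iteration


def two_stage_kmeans(points, k, sample_fraction=0.1, seed=0):
    """Fast stage on a random subset, then slow stage on the full dataset."""
    rng = random.Random(seed)
    sample_size = min(len(points), max(k, int(len(points) * sample_fraction)))
    sample = rng.sample(points, sample_size)  # assumed: uniform sampling

    # Fast stage: each iteration is cheap because the sample is small.
    initial = rng.sample(sample, k)
    fast_centers, fast_iters = lloyd(sample, initial)

    # Slow stage: full data, started from the fast-stage centers.
    centers, slow_iters = lloyd(points, fast_centers)
    return centers, fast_iters, slow_iters
```

Calling `two_stage_kmeans(points, k=3)` on a list of equal-length tuples returns the final centers together with the iteration counts of both stages, so the abstract's claim that the slow stage needs few iterations can be checked on one's own data by comparing `slow_iters` with a run of `lloyd` on the full dataset from random initial centers.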
Pages: 110-122
Page count: 13