A deterministic method for initializing K-means clustering

Cited by: 0
Authors
Su, T [1]
Dy, J [1]
Affiliations
[1] Northeastern Univ, Boston, MA 02115 USA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The performance of K-means clustering depends on the initial guess of the partition. In this paper we motivate, theoretically and experimentally, the use of a deterministic divisive hierarchical method for initialization, which we refer to as PCA-Part (Principal Components Analysis Partitioning). The criterion that K-means clustering minimizes is the SSE (sum-squared-error) criterion. The first principal direction (the eigenvector corresponding to the largest eigenvalue of the covariance matrix) is the direction that contributes most to the SSE. Hence, the first principal direction is a good candidate direction onto which to project a cluster for splitting. This is the basis of the PCA-Part initialization method. Our experiments reveal that PCA-Part generally leads K-means to generate clusters with SSE values close to the minimum SSE values obtained by one hundred random-start runs. In addition, this deterministic initialization method often leads K-means to faster convergence (fewer iterations) than random methods. We also show theoretically, and confirm experimentally on synthetic data, when PCA-Part may fail.
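The splitting idea in the abstract can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the authors' implementation: the abstract does not say which cluster is split next or where along the principal direction the cut is made, so the sketch assumes that the cluster with the largest SSE is split and that the cut is placed at the projected cluster mean. The function names `sse` and `pca_part_init` are hypothetical.

```python
import numpy as np


def sse(points):
    """Sum of squared distances from each point to the cluster mean."""
    return float(((points - points.mean(axis=0)) ** 2).sum())


def pca_part_init(X, k):
    """Return k initial centroids obtained by repeatedly splitting a
    cluster along its first principal direction."""
    X = np.asarray(X, dtype=float)
    clusters = [X]
    while len(clusters) < k:
        # Assumption: the cluster with the largest SSE is split next.
        idx = max(range(len(clusters)), key=lambda i: sse(clusters[i]))
        pts = clusters.pop(idx)
        centered = pts - pts.mean(axis=0)
        # First principal direction: eigenvector of the covariance matrix
        # with the largest eigenvalue (eigh sorts eigenvalues ascending).
        cov = centered.T @ centered / len(pts)
        _, eigvecs = np.linalg.eigh(cov)
        direction = eigvecs[:, -1]
        # Assumption: cut at the projected mean, which is 0 after centering.
        proj = centered @ direction
        left, right = pts[proj <= 0], pts[proj > 0]
        if len(left) == 0 or len(right) == 0:
            raise ValueError("cluster cannot be split; k may be too large")
        clusters.extend([left, right])
    return np.array([c.mean(axis=0) for c in clusters])
```

The returned array has one row per cluster and can be used as the starting centroids for any K-means implementation that accepts explicit initial centers, for example as the `init` array of scikit-learn's `KMeans` with `n_init=1`. Because no random numbers are involved, repeated calls on the same data give the same partition.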
Pages: 784-786
Page count: 3