HOW THE INITIALIZATION AFFECTS THE STABILITY OF THE k-MEANS ALGORITHM

Cited by: 27
Authors
Bubeck, Sebastien [1]
Meila, Marina [2]
von Luxburg, Ulrike [3]
Affiliations
[1] Ctr Recerca Matemat Barcelona, Barcelona, Spain
[2] Univ Washington, Dept Stat, Seattle, WA 98195 USA
[3] Max Planck Inst Biol Cybernet, Tubingen, Germany
Keywords
Clustering; k-means; stability; model selection
DOI
10.1051/ps/2012013
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
We investigate the role of the initialization for the stability of the k-means clustering algorithm. As opposed to other papers, we consider the actual k-means algorithm (also known as Lloyd's algorithm). In particular, we leverage the property that this algorithm can get stuck in local optima of the k-means objective function. We are interested in the actual clustering, not only in the costs of the solution. We analyze when different initializations lead to the same local optimum, and when they lead to different local optima. This enables us to prove that it is reasonable to select the number of clusters based on stability scores.
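The idea in the abstract, that restarts of Lloyd's algorithm from different initializations may or may not reach the same local optimum, can be illustrated with a minimal sketch. This is not the paper's procedure or its analysis. The function names (`lloyd`, `pair_disagreement`, `instability`), the choice to draw initial centers uniformly from the data, and the use of pairwise co-membership disagreement as the distance between two clusterings are all assumptions made for illustration.

```python
import random
from itertools import combinations


def lloyd(points, k, rng, max_iter=100):
    """Run plain Lloyd iterations; initial centers are k distinct data points (assumed scheme)."""
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        if new_labels == labels:  # assignments unchanged: a local optimum was reached
            break
        labels = new_labels
        # Update step: move each center to the mean of its cluster.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def pair_disagreement(a, b):
    """Fraction of point pairs grouped together in one clustering but apart in the other."""
    pairs = list(combinations(range(len(a)), 2))
    diff = sum((a[i] == a[j]) != (b[i] == b[j]) for i, j in pairs)
    return diff / len(pairs)


def instability(points, k, n_runs=10, seed=0):
    """Average pairwise disagreement between clusterings from n_runs random restarts."""
    rng = random.Random(seed)
    runs = [lloyd(points, k, rng) for _ in range(n_runs)]
    scores = [pair_disagreement(a, b) for a, b in combinations(runs, 2)]
    return sum(scores) / len(scores)
```

Under these assumptions, one would call `instability(points, k)` for several candidate values of `k`, with `points` given as a list of coordinate tuples, and prefer values of `k` whose score is low. A score of 0 means every restart produced the same partition up to relabeling.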
Pages: 436 - 452
Number of pages: 17
Related papers
50 in total
  • [21] Greedy centroid initialization for federated K-means
    Yang, Kun
    Amiri, Mohammad Mohammadi
    Kulkarni, Sanjeev R.
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2024, 66 (06) : 3393 - 3425
  • [22] How much can k-means be improved by using better initialization and repeats?
    Franti, Pasi
    Sieranoja, Sami
    [J]. PATTERN RECOGNITION, 2019, 93 : 95 - 112
  • [23] Hierarchical initialization approach for K-Means clustering
    Lu, J. F.
    Tang, J. B.
    Tang, Z. M.
    Yang, J. Y.
    [J]. PATTERN RECOGNITION LETTERS, 2008, 29 (06) : 787 - 795
  • [24] Stable Initialization Scheme for K-Means Clustering
    Xu, Junling
    [J]. Wuhan University Journal of Natural Sciences, 2009, 14 (01) : 24 - 28
  • [25] Initialization methods for remote sensing image clustering using K-means algorithm
    Zhong Y.-F.
    Zhang L.-P.
    [J]. Xi Tong Gong Cheng Yu Dian Zi Ji Shu/Systems Engineering and Electronics, 2010, 32 (09) : 2009 - 2014
  • [26] A k-means clustering algorithm initialization for unsupervised statistical satellite image segmentation
    Rekik, Ahmed
    Zribi, Mourad
    Benjelloun, Mohammed
    ben Hamida, Ahmed
    [J]. 2006 1ST IEEE INTERNATIONAL CONFERENCE ON E-LEARNING IN INDUSTRIAL ELECTRONICS, 2006, : 11 - +
  • [27] EFFECTIVE INITIALIZATION OF K-MEANS FOR COLOR QUANTIZATION
    Celebi, M. Emre
    [J]. 2009 16TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOLS 1-6, 2009, : 1649 - 1652
  • [28] An Improved Initialization Center K-means Clustering Algorithm Based on Distance and Density
    Duan, Yanling
    Liu, Qun
    Xia, Shuyin
    [J]. ADVANCES IN MATERIALS, MACHINERY, ELECTRONICS II, 2018, 1955
  • [29] A Modified K-means Algorithm - Two-Layer K-means Algorithm
    Liu, Chen-Chung
    Chu, Shao-Wei
    Chan, Yung-Kuan
    Yu, Shyr-Shen
    [J]. 2014 TENTH INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION HIDING AND MULTIMEDIA SIGNAL PROCESSING (IIH-MSP 2014), 2014, : 447 - 450
  • [30] Research on k-means Clustering Algorithm: An Improved k-means Clustering Algorithm
    Shi Na
    Liu Xumin
    Guan Yong
    [J]. 2010 THIRD INTERNATIONAL SYMPOSIUM ON INTELLIGENT INFORMATION TECHNOLOGY AND SECURITY INFORMATICS (IITSI 2010), 2010, : 63 - 67