Improved initial cluster center selection in K-means clustering

Cited by: 16
Authors
Zhu, Minchen [1]
Wang, Weizhi [2]
Huang, Jingshan [3]
Affiliations
[1] Fuzhou Univ, Coll Math & Comp Sci, Fuzhou 350002, Peoples R China
[2] Fuzhou Univ, Coll Civil Engn, Fuzhou 350002, Peoples R China
[3] Univ S Alabama, Sch Comp, Mobile, AL 36688 USA
Keywords
Initial cluster centre; Inner-class distance; Inter-class distance; K-means clustering; Algorithm
DOI
10.1108/EC-11-2012-0288
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Purpose - It is well known that the selection of initial cluster centers can significantly affect K-means clustering results. This paper proposes an improved, efficient method for selecting them.
Design/methodology/approach - Because the inner-class distance among samples within the same cluster should be smaller than the inter-class distance among clusters, the algorithm dynamically adjusts initial cluster centers that are randomly selected. The adjusted centers are highly representative in the sense that they are distributed among as many samples as possible, which effectively reduces the local optima that are common in K-means clustering. In addition, the algorithm obtains all initial cluster centers simultaneously (instead of one center at a time) during the dynamic adjustment.
Findings - Experimental results demonstrate that the proposed algorithm greatly improves the accuracy of traditional K-means clustering and does so more efficiently.
Originality/value - The paper presents an efficient algorithm that dynamically adjusts randomly selected initial cluster centers. The adjusted centers are highly representative, i.e., they are distributed among as many samples as possible. As a result, the local optima that are common in K-means clustering are effectively reduced, which improves clustering accuracy. The algorithm is also cost-efficient: the improved accuracy is obtained more efficiently than with the traditional K-means algorithm.
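The abstract does not spell out the adjustment rule, so the following is only a minimal sketch of one way the stated idea could be realized, not the authors' algorithm. It assumes a hypothetical rule: whenever a sample lies farther from its nearest center than the two closest centers lie from each other, those two centers are merged into their midpoint and the sample is promoted to a center. All function and parameter names (`closest_pair`, `adjust_initial_centers`, `seed`) are illustrative.

```python
import math
import random


def closest_pair(centers):
    """Return (distance, i, j) for the two centers that are closest together."""
    best = None
    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            d = math.dist(centers[i], centers[j])
            if best is None or d < best[0]:
                best = (d, i, j)
    return best


def adjust_initial_centers(samples, k, seed=0):
    """Pick k random samples as centers, then adjust them in a single pass.

    Assumed (hypothetical) rule, not taken from the paper: if a sample is
    farther from its nearest center than the two closest centers are from
    each other, merge those two centers and make the sample a new center.
    """
    if not 2 <= k <= len(samples):
        raise ValueError("k must be between 2 and the number of samples")
    rng = random.Random(seed)
    # Random initial selection; all k centers are held and adjusted together.
    centers = [list(p) for p in rng.sample(samples, k)]
    for x in samples:
        d_near = min(math.dist(x, c) for c in centers)  # sample to nearest center
        d_pair, i, j = closest_pair(centers)            # smallest center-to-center gap
        if d_near > d_pair:
            # Merge the two crowded centers into their midpoint ...
            centers[i] = [(a + b) / 2 for a, b in zip(centers[i], centers[j])]
            # ... and let the poorly covered sample take the freed slot.
            centers[j] = list(x)
    return centers
```

The returned centers could then seed an ordinary K-means run, for instance as the array passed to the `init` argument of scikit-learn's `KMeans`. Whether such a rule reproduces the accuracy or cost reported in the paper is not something this sketch can show.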
Pages: 1661-1667
Number of pages: 7