An efficient global K-means clustering algorithm based on weighted space partitioning

Cited by: 0

Authors:

Qu F.-H. [1]

Pan Y.-T. [1]

Yang Y. [1,2]

Hu Y.-T. [3]

Song J.-F. [1]

Wei C.-Y. [3]

Affiliations:

[1] College of Computer Science and Technology, Changchun University of Science and Technology, Changchun

[2] Jilin Technology College of Electronic Information, Jilin

[3] College of Information Technology, Jilin Agricultural University, Changchun

Source:

Jilin Daxue Xuebao (Gongxueban)/Journal of Jilin University (Engineering and Technology Edition) | 2024 / Vol. 54 / No. 05

Keywords:

artificial intelligence; clustering center; incremental clustering; K-means algorithm; multidimensional grid space; weight;

DOI:

10.13229/j.cnki.jdxbgxb.20221338

Chinese Library Classification number:

Subject classification number:

Abstract:

To address the heavy computational cost of the global K-means clustering algorithm, which exhaustively tries every sample point as a candidate center, this paper proposes an efficient global K-means clustering algorithm based on weighted space partitioning. First, the sample space is divided into grids. A density criterion and a distance criterion are then used to filter the grids, retaining those that are dense and far from one another as candidate center grids. To avoid the limitation that the global K-means algorithm selects candidate centers only from the sample set, a weight criterion and a center iteration strategy are proposed to expand the candidate centers and increase their diversity. Finally, the candidate centers are traversed by incremental clustering to obtain the final clustering result. Experimental results on UCI data sets show that, compared with the global K-means algorithm, the new algorithm improves computational efficiency by 89.39% to 95.79% on average while preserving clustering accuracy. Compared with K-means++, IK-+, and the recently proposed CD algorithm, the new algorithm achieves higher accuracy and overcomes the unstable clustering results caused by random initialization. © 2024 Editorial Board of Jilin University. All rights reserved.
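The abstract gives only the outline of the method, so the following Python sketch is an illustration of the general idea and not the authors' algorithm. It assumes a uniform grid, a simple count threshold standing in for the density criterion, and cell centroids standing in for the weighted candidate centers. The distance criterion and the center iteration strategy are omitted. The names `lloyd`, `grid_candidates`, `grid_global_kmeans`, `bins`, and `min_count` are hypothetical.

```python
import numpy as np

def lloyd(X, centers, n_iter=50):
    """Standard K-means refinement from the given initial centers."""
    centers = centers.copy()
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Keep the old center if a cluster becomes empty.
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(len(centers))])
        if np.allclose(new, centers):
            break
        centers = new
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d.min(axis=1).sum()

def grid_candidates(X, bins=4, min_count=2):
    """Centroids of grid cells that hold at least min_count points
    (an assumed stand-in for the paper's density and weight criteria)."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    idx = np.minimum(((X - lo) / span * bins).astype(int), bins - 1)
    cells, inverse, counts = np.unique(
        idx, axis=0, return_inverse=True, return_counts=True)
    inverse = inverse.ravel()
    keep = counts >= min_count
    if not keep.any():          # fall back to every occupied cell
        keep[:] = True
    return np.array([X[inverse == c].mean(axis=0)
                     for c in range(len(cells)) if keep[c]])

def grid_global_kmeans(X, k, bins=4, min_count=2):
    """Incremental (global) K-means that tries only grid candidates
    as the new center at each step, instead of every sample point."""
    cands = grid_candidates(X, bins, min_count)
    centers = X.mean(axis=0, keepdims=True)   # best single center
    sse = ((X - centers) ** 2).sum()
    for _ in range(1, k):
        best = None
        for c in cands:
            trial, trial_sse = lloyd(X, np.vstack([centers, c]))
            if best is None or trial_sse < best[1]:
                best = (trial, trial_sse)
        centers, sse = best
    return centers, sse
```

Calling `grid_global_kmeans(X, k)` on an `(n, d)` array returns the `k` centers and the sum of squared errors. Because each step evaluates only the retained grid cells and not all `n` samples, the number of K-means runs per step drops from `n` to the number of candidate cells, which is the kind of saving the abstract describes.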

Citation

Pages: 1393-1400

Page count: 7

18 references in total

[1] Rahman M A, Islam M Z., A hybrid clustering technique combining a novel genetic algorithm with K-means, Knowledge-Based Systems, 71, pp. 345-365, (2014)
[2] Harjanti T W, Setiyani H, Trianto J, et al., Classification of mint leaf types based on the image using Euclidean distance and K-means clustering with shape and texture feature extraction, Tech-E, 5, 2, pp. 115-124, (2022)
[3] Liu Zhong-min, Li Zhan-ming, Li Bo-hao, et al., Spectral clustering image segmentation based on sparse matrix, Journal of Jilin University (Engineering and Technology Edition), 47, 4, pp. 1308-1313, (2017)
[4] Lim Z Y, Ong L Y, Leow M C., A review on clustering techniques: creating better user experience for online roadshow, Future Internet, 13, 9, (2021)
[5] Zhang Meng-su, Liu Chun-tian, Li Xi-jin, et al., Design of fuzzy comprehensive evaluation system for performance appraisal based on K-means clustering algorithm, Journal of Jilin University (Engineering and Technology Edition), 51, 5, pp. 1851-1856, (2021)
[6] Franti P, Sieranoja S., How much can K-means be improved by using better initialization and repeats?, Pattern Recognition, 93, pp. 95-112, (2019)
[7] Yang Yong, Chen Qiang, Qu Fu-heng, et al., SP-K-means-+ algorithm based on simulated partition, Journal of Jilin University (Engineering and Technology Edition), 51, 5, pp. 1808-1816, (2021)
[8] Geng X, Mu Y, Mao S, et al., An improved K-means algorithm based on fuzzy metrics, IEEE Access, 8, pp. 217416-217424, (2020)
[9] Shao Lun, Zhou Xin-zhi, Zhao Cheng-ping, et al., Improved K-means clustering algorithm based on multidimensional grid space, Computer Application, 38, 10, pp. 2850-2855, (2018)
[10] Manochandar S, Punniyamoorthy M, Jeyachitra R K., Development of new seed with modified validity measures for K-means clustering, Computers & Industrial Engineering, 141, (2020)
