An improved K-means algorithm for big data

Cited by: 12

Authors:

Moodi, Fatemeh [1]

Saadatfar, Hamid [2]

Affiliations:

[1] Hormozan Higher Educ Inst, Comp Engn Dept, Birjand, Iran

[2] Univ Birjand, Comp Engn Dept, Univ Blvd, Birjand, Southern Khoras, Iran

Source:

IET SOFTWARE | 2022 / Vol. 16 / Issue 01

Keywords:

Iterative methods - K-means clustering;

DOI:

10.1049/sfw2.12032

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Subject classification code:

081202 ; 0835 ;

Abstract:

An improved version of the K-means clustering algorithm is presented that can be applied to big data through lower processing loads with acceptable precision rates. In this method, the distances from each point to its two nearest centroids are used, along with their variations over the last two iterations. Points with an equidistance threshold greater than the equidistance index are eliminated from the distance calculations and stabilised in their cluster. These points are still compared with the research index (the cluster radius) in each iteration of the algorithm, and an excluded point is included in the calculations again if its distance from the stabilised cluster centroid is longer than the cluster radius. This can improve the clustering quality. Computerised tests on synthetic and real samples show that this method is able to improve the clustering quality by up to 41.85% in the best-case scenario. According to the findings, the proposed method is very beneficial to big data.
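
The stabilisation idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not define the equidistance index or the cluster radius precisely, so the sketch makes assumptions. The "gap" is taken to be the difference between a point's distances to its second-nearest and nearest centroids, a point is stabilised when its gap exceeds a hypothetical `equidistance_index` and has not shrunk since the previous iteration, and the cluster radius is assumed to be the mean member-to-centroid distance. The function name and parameters are illustrative only, and at least two centroids are required.

```python
import math


def mean_point(members):
    """Component-wise mean of a non-empty list of points."""
    n = len(members)
    return tuple(sum(coords) / n for coords in zip(*members))


def kmeans_with_stabilisation(points, centroids, equidistance_index, max_iter=100):
    """K-means that skips full distance scans for 'stabilised' points.

    Assumptions (not taken from the paper): the gap is d2 - d1, and the
    radius of a cluster is the mean distance of its members to the centroid.
    Returns (labels, centroids, number_of_full_scans).
    """
    k = len(centroids)
    centroids = [tuple(c) for c in centroids]
    labels = [None] * len(points)
    prev_gap = [None] * len(points)  # gap observed in the previous full scan
    stable = [False] * len(points)
    radius = [math.inf] * k
    full_scans = 0

    for _ in range(max_iter):
        changed = False
        for i, p in enumerate(points):
            if stable[i]:
                # Cheap check: one distance instead of k distances.
                if math.dist(p, centroids[labels[i]]) <= radius[labels[i]]:
                    continue
                stable[i] = False  # outside the radius: include it again

            # Full scan: distances to every centroid, keep the two nearest.
            full_scans += 1
            ranked = sorted((math.dist(p, c), j) for j, c in enumerate(centroids))
            (d1, nearest), (d2, _) = ranked[0], ranked[1]
            if labels[i] != nearest:
                labels[i] = nearest
                changed = True

            gap = d2 - d1
            if (prev_gap[i] is not None
                    and gap > equidistance_index
                    and gap >= prev_gap[i]):
                stable[i] = True  # clearly inside one cluster: stop scanning it
            prev_gap[i] = gap

        # Recompute centroids and the assumed radius of each cluster.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = mean_point(members)
                radius[j] = sum(math.dist(p, centroids[j]) for p in members) / len(members)

        if not changed:
            break

    return labels, centroids, full_scans


# Usage on two well-separated groups of points (toy data, not from the paper).
data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
labels, centres, scans = kmeans_with_stabilisation(data, [(0, 0), (10, 10)], 1.0)
```

The `full_scans` counter is only there to make the saving visible: a stabilised point costs one distance computation per iteration instead of one per centroid, which is where the lower processing load would come from under these assumptions.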


Pages: 48 - 59

Page count: 12

50 records in total

[1] Improvement of the Fast Clustering Algorithm Improved by K-Means in the Big Data
Xie, Ting
Liu, Ruihua
Wei, Zhengyuan
APPLIED MATHEMATICS AND NONLINEAR SCIENCES, 2020, 5 (01) : 1 - 10
[2] A K-Means Algorithm Application on Big Data
Eren, Beste
Karabulut, Ezgi Cilga
Alptekin, S. Emre
Alptekin, Gulfem Isiklar
WORLD CONGRESS ON ENGINEERING AND COMPUTER SCIENCE, WCECS 2015, VOL II, 2015, : 814 - 818
[3] The Application of Big Data Mining Prediction Based on Improved K-Means Algorithm
Qiao, Yuchen
Li, Yunlu
Lv, Xiaotian
2019 34RD YOUTH ACADEMIC ANNUAL CONFERENCE OF CHINESE ASSOCIATION OF AUTOMATION (YAC), 2019, : 348 - 351
[4] An Improvement to the K-means Algorithm Oriented to Big Data
Perez Ortega, Joaquin
Rodolfo Pazos, R.
Hidalgo, Miguel
Almanza, Nelva
Diaz-Parra, Ocotlan
Santaolaya, Rene
Caballero, Vitervo
PROCEEDINGS OF THE INTERNATIONAL CONFERENCE OF NUMERICAL ANALYSIS AND APPLIED MATHEMATICS 2014 (ICNAAM-2014), 2015, 1648
[5] Modified K-means Algorithm for Big Data Clustering
Sengupta, Debapriya
Roy, Sayantan Singha
Ghosh, Sarbani
Dasgupta, Ranjan
PROCEEDINGS 2017 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE (CSCI), 2017, : 1443 - 1448
[6] ANALYSIS AND APPLICATION OF BIG DATA FEATURE EXTRACTION BASED ON IMPROVED K-MEANS ALGORITHM
Yang, Wenjuan
SCALABLE COMPUTING-PRACTICE AND EXPERIENCE, 2024, 25 (01) : 137 - 145
[7] SMK-means: An Improved Mini Batch K-means Algorithm Based on Mapreduce with Big Data
Xiao, Bo
Wang, Zhen
Liu, Qi
Liu, Xiaodong
CMC-COMPUTERS MATERIALS & CONTINUA, 2018, 56 (03) : 365 - 379
[8] Analysis of K-Means and K-Medoids Algorithm For Big Data
Arora, Preeti
Deepali
Varshney, Shipra
1ST INTERNATIONAL CONFERENCE ON INFORMATION SECURITY & PRIVACY 2015, 2016, 78 : 507 - 512
[9] Application of an improved K-Means algorithm in data mining
Wang, JM
Guo, H
PROCEEDINGS OF THE 11TH INTERNATIONAL CONFERENCE ON INDUSTRIAL ENGINEERING AND ENGINEERING MANAGEMENT, VOLS 1 AND 2: INDUSTRIAL ENGINEERING AND ENGINEERING MANAGEMENT IN THE GLOBAL ECONOMY, 2005, : 416 - 419
[10] Enhancement of the K-Means Algorithm for Mixed Data in Big Data Platforms
Koren, Oded
Hallin, Carina Antonia
Perel, Nir
Bendet, Dror
INTELLIGENT SYSTEMS AND APPLICATIONS, VOL 1, 2019, 868 : 1025 - 1040
