An improved K-means algorithm for big data

Cited by: 12
Authors
Moodi, Fatemeh [1]
Saadatfar, Hamid [2]
Affiliations
[1] Hormozan Higher Educ Inst, Comp Engn Dept, Birjand, Iran
[2] Univ Birjand, Comp Engn Dept, Univ Blvd, Birjand, Southern Khoras, Iran
Keywords
Iterative methods; K-means clustering
DOI
10.1049/sfw2.12032
Chinese Library Classification number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
An improved version of the K-means clustering algorithm is presented that can be applied to big data by lowering the processing load while keeping an acceptable precision rate. The method uses the distances from each point to its two nearest centroids, along with the variations of those distances over the last two iterations. Points with an equidistance threshold greater than the equidistance index are eliminated from the distance calculations and stabilised in their cluster. These points are nevertheless compared with another index, the cluster radius, in each later iteration: an excluded point is included in the calculations again if its distance from the centroid of its stabilised cluster is longer than the cluster radius. This can improve the clustering quality. Computerised tests on synthetic and real samples show that the method improves the clustering quality by up to 41.85% in the best-case scenario. According to the findings, the proposed method is very beneficial to big data.
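The abstract does not give exact formulas, so the following is only a minimal Python sketch of the general idea (skip distance calculations for points that sit firmly inside a cluster, and bring them back if they drift outside the cluster radius). It is not the authors' implementation. The function name `stabilised_kmeans`, the parameter `gap_ratio`, the relative-gap test used as a stand-in for the equidistance index, and the mean member distance used as the cluster radius are all assumptions made for illustration. The variation over the last two iterations mentioned in the abstract is omitted.

```python
import numpy as np


def stabilised_kmeans(X, init_centroids, gap_ratio=0.5, n_iter=10):
    """Lloyd-style K-means that skips full distance rows for stabilised points.

    ASSUMPTIONS (not taken from the paper):
    - a point is stabilised when (d2 - d1) > gap_ratio * d2, where d1 and d2
      are its distances to the nearest and second-nearest centroid;
    - the cluster radius is the mean distance of members to their centroid.
    Requires at least two centroids.
    """
    centroids = np.array(init_centroids, dtype=float)
    n, k = len(X), len(centroids)
    labels = np.zeros(n, dtype=int)
    active = np.ones(n, dtype=bool)  # points still in the full calculations
    n_evals = 0                      # point-to-centroid distances computed

    for _ in range(n_iter):
        idx = np.flatnonzero(active)
        if idx.size:
            # Distances from every active point to every centroid.
            d = np.linalg.norm(X[idx, None, :] - centroids[None, :, :], axis=2)
            n_evals += d.size
            two = np.argsort(d, axis=1)[:, :2]
            rows = np.arange(idx.size)
            d1, d2 = d[rows, two[:, 0]], d[rows, two[:, 1]]
            labels[idx] = two[:, 0]
            # Stabilise points that are far from the boundary between
            # their two nearest centroids.
            active[idx[(d2 - d1) > gap_ratio * d2]] = False

        # Standard centroid update from the current labels.
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)

        # One distance per point to its own centroid: used both to define
        # the (assumed) radius and to re-check stabilised points.
        own = np.linalg.norm(X - centroids[labels], axis=1)
        n_evals += n
        radius = np.array([own[labels == j].mean() if np.any(labels == j)
                           else 0.0 for j in range(k)])
        # Stabilised points outside their cluster radius are re-included.
        active |= own > radius[labels]

    return labels, centroids, n_evals
```

The function returns the labels, the final centroids, and the number of point-to-centroid distances it computed, which can be compared with the `n * k * n_iter` evaluations of a plain Lloyd iteration. With the mean distance as radius, a large share of points is re-included at each iteration, so any saving depends heavily on how the radius and the threshold are defined.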
Pages: 48-59
Page count: 12