A Comparative Performance Analysis of Fast K-Means Clustering Algorithms

Citations: 0

Authors:

Beecks, Christian [1]

Berns, Fabian [1]

Huewel, Jan David [1]

Linxen, Andrea [1]

Schlake, Georg Stefan [1]

Duesterhus, Tim [2]

Affiliations:

[1] Univ Hagen, Hagen, Germany

[2] Univ Munster, Munster, Germany

Source:

INFORMATION INTEGRATION AND WEB INTELLIGENCE, IIWAS 2022 | 2022 / Vol. 13635

Keywords:

Data mining; Clustering; Performance evaluation;

DOI:

10.1007/978-3-031-21047-1_11

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Data clustering is a fundamental and widespread problem in computer science, which has become very attractive in both scientific communities and application domains. Among the different algorithmic methods, the k-means algorithm, and its prominent implementation, the Lloyd algorithm, has developed into a de facto standard for partitioning-based clustering. This algorithm, however, turns out to be inefficient on very large databases. In order to mitigate this efficiency issue, several fast k-means algorithms for ad-hoc and exact data clustering have been proposed in the literature. Since their inner workings and applied pruning criteria differ, it is difficult to predict the efficiency of individual algorithms in certain application scenarios. We thus present a performance analysis of existing fast k-means algorithms. We focus on simple interpretability and comparability and abstract from many implementation details so as to provide a guide for data scientists and practitioners alike.


Pages: 119-125

Number of pages: 7

