On the quality of k-means clustering based on grouped data

Cited by: 2

Authors:

Kaeaerik, Meelis [1]

Paerna, Kalev [1]

Affiliations:

[1] Univ Tartu, Inst Stat Math, EE-50090 Tartu, Estonia

Source:

JOURNAL OF STATISTICAL PLANNING AND INFERENCE | 2009 / Vol. 139 / Issue 11

Keywords:

Grouped data; k-Means; Lloyd's algorithm; Loss-function; Voronoi partitions; QUANTIZATION;

DOI:

10.1016/j.jspi.2009.05.021

Chinese Library Classification (CLC) number:

O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];

Subject classification codes:

020208 ; 070103 ; 0714 ;

Abstract:

Let P be a probability distribution (possibly empirical) on the real line R. Consider the problem of finding the k-mean of P, i.e., a set A of at most k points that minimizes a given loss function. It is known that the k-mean can be found using an iterative algorithm by Lloyd [1982. Least squares quantization in PCM. IEEE Transactions on Information Theory 28, 129-136]. However, depending on the complexity of the distribution P, applying this algorithm can be quite resource-consuming. One way to overcome the problem is to group the original data and calculate the k-mean on the basis of the grouped data. As a result, the new k-mean will be biased, and our aim is to measure the loss in the quality of approximation caused by this approach. (C) 2009 Elsevier B.V. All rights reserved.
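
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes a squared-error loss, equal-width bins represented by their bin means, and a simulated normal sample, none of which are stated in the abstract. The names `lloyd_1d`, `loss`, and `group` are hypothetical helpers introduced here. The sketch runs Lloyd-style iterations once on the raw sample and once on the grouped sample, then evaluates both sets of centers on the raw sample.

```python
import math
import random


def lloyd_1d(points, weights, k, iters=100):
    """Weighted Lloyd iterations on the real line (squared-error loss assumed)."""
    pts = sorted(points)
    n = len(pts)
    # Assumed initialisation: k evenly spaced order statistics of the support.
    centers = [pts[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    for _ in range(iters):
        sums = [0.0] * k
        mass = [0.0] * k
        for x, w in zip(points, weights):
            # Assign x to its nearest center (its Voronoi cell).
            j = min(range(k), key=lambda i: (x - centers[i]) ** 2)
            sums[j] += w * x
            mass[j] += w
        # Move each center to the weighted mean of its cell; keep empty cells fixed.
        new = [sums[j] / mass[j] if mass[j] > 0 else centers[j] for j in range(k)]
        if new == centers:
            break
        centers = new
    return centers


def loss(points, weights, centers):
    """Weighted mean squared distance from each point to its nearest center."""
    total = sum(weights)
    return sum(w * min((x - c) ** 2 for c in centers)
               for x, w in zip(points, weights)) / total


def group(points, width):
    """Group points into equal-width bins; return bin means and bin counts."""
    bins = {}
    for x in points:
        b = math.floor(x / width)
        s, c = bins.get(b, (0.0, 0))
        bins[b] = (s + x, c + 1)
    reps = [s / c for s, c in bins.values()]
    counts = [c for _, c in bins.values()]
    return reps, counts


rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(2000)]  # simulated sample (assumption)
ones = [1.0] * len(data)

full_centers = lloyd_1d(data, ones, k=3)
reps, counts = group(data, width=0.25)             # bin width is an arbitrary choice
grouped_centers = lloyd_1d(reps, counts, k=3)

# Both sets of centers are evaluated on the original, ungrouped sample.
loss_full = loss(data, ones, full_centers)
loss_grouped = loss(data, ones, grouped_centers)
print("loss with raw-data centers:    ", loss_full)
print("loss with grouped-data centers:", loss_grouped)
```

The difference between the two printed values is one informal way to look at the loss of approximation quality caused by grouping. Because Lloyd iterations only reach a local optimum, neither value is guaranteed to be the smaller one in any particular run.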

Pages: 3836-3841

Number of pages: 6
