Quantization/Clustering: when and why does k-means work?

Cited by: 0
Authors
Levrard, Clement [1]
Affiliations
[1] Univ Paris Diderot, LPMA, 8 Pl Aurélie Nemours, F-75013 Paris, France
Source
JOURNAL OF THE SFDS | 2018 / Vol. 159 / Issue 01
Keywords
k-means; clustering; quantization; separation rate; distortion
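DOI
Not available
Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714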
Abstract
Though mostly used as a clustering algorithm, k-means was originally designed as a quantization algorithm: it aims to compress a probability distribution into k points. Building upon Levrard (2015) and Tang and Monteleoni (2016a), we investigate how and when these two approaches are compatible. Specifically, we show that, provided the sample distribution satisfies a margin-like condition (in the sense of Mammen and Tsybakov, 1999, for supervised learning), both the associated empirical risk minimizer and the output of Lloyd's algorithm provide almost optimal classification in certain cases (in the sense of Azizyan et al., 2013). We also show that they achieve fast and optimal convergence rates in terms of sample size and compression risk.
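As an illustration of the two objects the abstract names, the following is a minimal sketch of Lloyd's algorithm and of the empirical distortion (mean squared distance to the nearest code point) that the quantization view seeks to minimize. It is not taken from the paper: the function names `sq_dist`, `distortion`, and `lloyd`, the random initialization from sample points, and the stopping rule are all assumptions made for this example.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def sq_dist(x: Point, c: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def assign(points: Sequence[Point], codebook: Sequence[Point]) -> List[int]:
    """Index of the nearest code point for each sample point (ties go to the lowest index)."""
    return [
        min(range(len(codebook)), key=lambda j: sq_dist(x, codebook[j]))
        for x in points
    ]


def distortion(points: Sequence[Point], codebook: Sequence[Point]) -> float:
    """Empirical distortion: mean squared distance to the nearest code point."""
    return sum(min(sq_dist(x, c) for c in codebook) for x in points) / len(points)


def lloyd(
    points: Sequence[Point], k: int, n_iter: int = 50, seed: int = 0
) -> Tuple[List[Point], List[int]]:
    """Illustrative Lloyd iterations (assumed setup, not the paper's exact procedure)."""
    rng = random.Random(seed)
    # Assumption: initialize the codebook with k distinct sample points.
    codebook = rng.sample(list(points), k)
    for _ in range(n_iter):
        labels = assign(points, codebook)
        new_codebook = []
        for j in range(k):
            cell = [x for x, lab in zip(points, labels) if lab == j]
            if cell:
                # Move the code point to the mean of its cell.
                new_codebook.append(tuple(sum(col) / len(cell) for col in zip(*cell)))
            else:
                # Keep the old code point if its cell is empty.
                new_codebook.append(codebook[j])
        if new_codebook == codebook:
            break  # fixed point reached
        codebook = new_codebook
    # Labels are recomputed so they always match the returned codebook.
    return codebook, assign(points, codebook)


if __name__ == "__main__":
    sample = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
    codebook, labels = lloyd(sample, k=2)
    print(codebook, labels, distortion(sample, codebook))
```

The returned codebook is the quantizer (the compression of the sample into k points), while the returned labels are the induced clustering. The abstract's question is when a codebook with low distortion also yields a good clustering.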
Pages: 1-26
Page count: 26