Quantization/Clustering: when and why does k-means work?

Cited by: 0
Author
Levrard, Clement [1]
Affiliation
[1] Univ Paris Diderot, LPMA, 8 Pl Aurelie Nemours, F-75013 Paris, France
Source
JOURNAL OF THE SFDS | 2018 / Vol. 159 / No. 01
Keywords
k-means; clustering; quantization; separation rate; distortion;
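DOI
Not available
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification codes
020208 ; 070103 ; 0714 ;
Abstract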
Though mostly used as a clustering algorithm, k-means was originally designed as a quantization algorithm: it aims to compress a probability distribution into k points. Building upon Levrard (2015) and Tang and Monteleoni (2016a), we investigate how and when these two approaches are compatible. Specifically, we show that, provided the sample distribution satisfies a margin-like condition (in the sense of Mammen and Tsybakov, 1999, for supervised learning), both the associated empirical risk minimizer and the output of Lloyd's algorithm provide almost optimal classification in certain cases (in the sense of Azizyan et al., 2013). We also show that they achieve fast and optimal convergence rates in terms of sample size and compression risk.
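For readers unfamiliar with the two objects the abstract compares, the following is a minimal sketch of textbook Lloyd's algorithm together with the empirical distortion (mean squared distance to the nearest code point) that quantization seeks to minimize. It is not taken from the paper: the function names `sq_dist`, `nearest`, `distortion`, and `lloyd`, the random initialization from sample points, and the fixed iteration count are all assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, codebook):
    """Index of the code point closest to `point` (the quantization step)."""
    return min(range(len(codebook)), key=lambda j: sq_dist(point, codebook[j]))


def distortion(points, codebook):
    """Empirical distortion: mean squared distance to the nearest code point."""
    total = sum(sq_dist(p, codebook[nearest(p, codebook)]) for p in points)
    return total / len(points)


def lloyd(points, k, n_iter=20, seed=0):
    """Hypothetical helper: plain Lloyd iterations from a random sample init."""
    rng = random.Random(seed)
    # Assumption: initialize the codebook with k distinct sample points.
    codebook = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        # Assignment step: send each point to its nearest code point.
        labels = [nearest(p, codebook) for p in points]
        # Update step: move each code point to the mean of its cell.
        for j in range(k):
            cell = [p for p, lab in zip(points, labels) if lab == j]
            if cell:  # leave an empty cell's code point unchanged
                codebook[j] = [sum(c) / len(cell) for c in zip(*cell)]
    labels = [nearest(p, codebook) for p in points]
    return codebook, labels


# Two well-separated groups: the codebook (quantization) and the labels
# (clustering) can be read off the same output.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
codebook, labels = lloyd(points, k=2)
print(sorted(codebook), labels, distortion(points, codebook))
```

The codebook is the quantization output and the labels are the induced clustering. The abstract's question is when a codebook with low distortion also yields labels close to the right classification. Lloyd's algorithm in general only reaches a local minimum of the distortion, so the result can depend on the initialization.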
Pages: 1-26
Page count: 26