Sublinear-time approximation algorithms for clustering via random sampling

Cited by: 25
Authors
Czumaj, Artur [1]
Sohler, Christian
Affiliations
[1] Univ Warwick, Dept Comp Sci, Coventry CV4 7AL, W Midlands, England
[2] Univ Gesamthsch Paderborn, Heinz Nixdorf Inst, D-33102 Paderborn, Germany
[3] Univ Gesamthsch Paderborn, Dept Comp Sci, D-33102 Paderborn, Germany
Keywords
clustering; k-median; k-means; min-sum clustering; random sampling
DOI
10.1002/rsa.20157
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
We present a novel analysis of a random sampling approach for four clustering problems in metric spaces: k-median, k-means, min-sum k-clustering, and balanced k-median. For all these problems, we consider the following simple sampling scheme: select a small sample set of input points uniformly at random, then run an approximation algorithm on this sample set to compute an approximation of the best possible clustering of this set. Our main technical contribution is a significantly strengthened analysis of the approximation guarantee of this scheme for the clustering problems. The main motivation behind our analyses was to design sublinear-time algorithms for clustering problems. Our second contribution is the development of new approximation algorithms for the aforementioned clustering problems. Using our random sampling approach, we obtain, for the first time, approximation algorithms for these problems whose running time is independent of the input size and depends only on k and the diameter of the metric space. © 2006 Wiley Periodicals, Inc.
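The sampling scheme described in the abstract can be illustrated with a minimal sketch for the k-median objective. This is not the authors' algorithm or analysis: the function names, the `sample_size` parameter, and the exhaustive-search solver used on the sample are all assumptions made for illustration (the abstract only says "some approximation algorithm"), and the sample size needed for a provable guarantee is not stated in this record.

```python
import itertools
import random


def kmedian_cost(points, centers, dist):
    """k-median cost: sum over all points of the distance to the nearest center."""
    return sum(min(dist(p, c) for c in centers) for p in points)


def solve_on_sample(sample, k, dist):
    """Hypothetical stand-in solver: try every k-subset of the sample as centers.

    Any approximation algorithm could be plugged in here instead; exhaustive
    search is used only because it is short and easy to check.
    """
    return min(
        itertools.combinations(sample, k),
        key=lambda centers: kmedian_cost(sample, centers, dist),
    )


def sampled_kmedian(points, k, sample_size, dist, rng=None):
    """Draw a uniform random sample, then cluster only the sample."""
    if rng is None:
        rng = random.Random()
    size = min(sample_size, len(points))
    if size < k:
        raise ValueError("sample must contain at least k points")
    sample = rng.sample(points, size)  # uniform sampling without replacement
    return list(solve_on_sample(sample, k, dist))


if __name__ == "__main__":
    # Toy one-dimensional metric with two well-separated groups (illustrative data).
    data = [float(i) for i in range(10)] + [100.0 + i for i in range(10)]
    centers = sampled_kmedian(data, 2, 8, lambda a, b: abs(a - b), random.Random(0))
    print(centers, kmedian_cost(data, centers, lambda a, b: abs(a - b)))
```

The work done after sampling depends only on `sample_size` and `k`, not on the number of input points, which is the property that makes the scheme a candidate for sublinear-time clustering.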
Pages: 226-256
Number of pages: 31
Related papers
50 in total
  • [31] Sublinear-time distributed algorithms for detecting small cliques and even cycles
    Talya Eden
    Nimrod Fiat
    Orr Fischer
    Fabian Kuhn
    Rotem Oshman
    [J]. Distributed Computing, 2022, 35 : 207 - 234
  • [32] Metric Sublinear Algorithms via Linear Sampling
    Esfandiari, Hossein
    Mitzenmacher, Michael
    [J]. 2018 IEEE 59TH ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE (FOCS), 2018, : 11 - 22
  • [33] Sublinear-time algorithms for monomer-dimer systems on bounded degree graphs
    Lelarge, Marc
    Zhou, Hang
    [J]. THEORETICAL COMPUTER SCIENCE, 2014, 548 : 68 - 78
  • [34] Faster Sublinear-Time Edit Distance
    Bringmann, Karl
    Cassis, Alejandro
    Fischer, Nick
    Kociumaka, Tomasz
    [J]. PROCEEDINGS OF THE 2024 ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, SODA, 2024, : 3274 - 3301
  • [35] Sublinear-Time Algorithms for Monomer-Dimer Systems on Bounded Degree Graphs
    Lelarge, Marc
    Zhou, Hang
    [J]. ALGORITHMS AND COMPUTATION, 2013, 8283 : 141 - 151
  • [36] Sublinear-Time Computation in the Presence of Online Erasures
    Kalemaj, Iden
    Raskhodnikova, Sofya
    Varma, Nithin
    [J]. Theory of Computing, 2023, 19 (01): : 1 - 48
  • [37] Sublinear-time reductions for big data computing
    Gao, Xiangyu
    Li, Jianzhong
    Miao, Dongjing
    [J]. THEORETICAL COMPUTER SCIENCE, 2022, 932 : 1 - 12
  • [38] Sublinear Time and Space Algorithms for Correlation Clustering via Sparse-Dense Decompositions
    Assadi, Sepehr
    Wang, Chen
    [J]. Leibniz International Proceedings in Informatics, LIPIcs, 2022, 215
  • [39] Sublinear-Time Computation in the Presence of Online Erasures
    Kalemaj, Iden
    Raskhodnikova, Sofya
    Varma, Nithin
    [J]. THEORY OF COMPUTING, 2023, 19
  • [40] Sublinear-Time Reductions for Big Data Computing
    Gao, Xiangyu
    Li, Jianzhong
    Miao, Dongjing
    [J]. COMBINATORIAL OPTIMIZATION AND APPLICATIONS, COCOA 2021, 2021, 13135 : 374 - 388