Sublinear-time approximation algorithms for clustering via random sampling

Cited by: 25
Authors
Czumaj, Artur [1]
Sohler, Christian
Affiliations
[1] Univ Warwick, Dept Comp Sci, Coventry CV4 7AL, W Midlands, England
[2] Univ Gesamthsch Paderborn, Heinz Nixdorf Inst, D-33102 Paderborn, Germany
[3] Univ Gesamthsch Paderborn, Dept Comp Sci, D-33102 Paderborn, Germany
Keywords
clustering; k-median; k-means; min-sum clustering; random sampling
DOI
10.1002/rsa.20157
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
We present a novel analysis of a random sampling approach for four clustering problems in metric spaces: k-median, k-means, min-sum k-clustering, and balanced k-median. For all these problems, we consider the following simple sampling scheme: select a small sample set of input points uniformly at random, then run some approximation algorithm on this sample set to compute an approximation of the best possible clustering of this set. Our main technical contribution is a significantly strengthened analysis of the approximation guarantee achieved by this scheme for the clustering problems. The main motivation behind our analyses was to design sublinear-time algorithms for clustering problems. Our second contribution is the development of new approximation algorithms for the aforementioned clustering problems. Using our random sampling approach, we obtain for these problems, for the first time, approximation algorithms whose running time is independent of the input size and depends only on k and the diameter of the metric space. © 2006 Wiley Periodicals, Inc.
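The sampling scheme described in the abstract can be illustrated with a minimal Python sketch for the k-median case. This is not the authors' implementation: the function names, the `sample_size` parameter, and the exhaustive-search solver run on the sample are all assumptions made for illustration, and the record does not state how large the sample must be for the paper's guarantees to hold.

```python
import itertools
import random


def kmedian_cost(points, centers, dist):
    """Average distance from each point to its nearest center."""
    return sum(min(dist(p, c) for c in centers) for p in points) / len(points)


def solve_on_sample(sample, k, dist):
    """Hypothetical stand-in for 'some approximation algorithm':
    exhaustive search over all k-subsets of the sample as centers.
    Only practical for small samples and small k."""
    return min(
        itertools.combinations(sample, k),
        key=lambda centers: kmedian_cost(sample, centers, dist),
    )


def sampled_kmedian(points, k, sample_size, dist, rng=None):
    """Draw a uniform random sample and cluster only the sample.
    The cost of this step depends on sample_size and k, not on len(points)."""
    rng = rng or random.Random()
    sample = rng.sample(points, min(sample_size, len(points)))
    return solve_on_sample(sample, k, dist)


if __name__ == "__main__":
    rng = random.Random(0)
    # Two well-separated groups on the real line (illustrative data only).
    points = [rng.gauss(0, 1) for _ in range(5000)]
    points += [rng.gauss(10, 1) for _ in range(5000)]

    def dist(a, b):
        return abs(a - b)

    centers = sampled_kmedian(points, k=2, sample_size=40, dist=dist, rng=rng)
    print("centers from sample:", centers)
    print("cost on full input:", kmedian_cost(points, centers, dist))
```

Evaluating the cost on the full input, as the last line does, takes linear time and is included only to check the result; the clustering step itself touches just the sampled points.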
Pages: 226-256
Page count: 31