k-Center Clustering with Outliers in Sliding Windows

Cited by: 3
Authors
Pellizzoni, Paolo [1]
Pietracaprina, Andrea [1]
Pucci, Geppino [1]
Affiliations
[1] Univ Padua, Dept Informat Engn, Via Gradenigo 6-B, I-35131 Padua, Italy
Keywords
k-center with outliers; effective diameter; big data; data stream model; sliding windows; coreset; doubling dimension; approximation algorithms; ALGORITHMS
DOI
10.3390/a15020052
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Metric k-center clustering is a fundamental unsupervised learning primitive. Although widely used, this primitive is heavily affected by noise in the data, so a more sensible variant seeks the best solution that disregards a given number z of points of the dataset, which are called outliers. We provide efficient algorithms for this important variant in the streaming model under the sliding window setting, where, at each time step, the dataset to be clustered is the window W of the most recent data items. For general metric spaces, our algorithms achieve O(1) approximation and, remarkably, require a working memory linear in k+z and only logarithmic in |W|. For spaces of bounded doubling dimension, the approximation can be made arbitrarily close to 3. For these latter spaces, we show, as a by-product, how to estimate the effective diameter of the window W, which is a measure of the spread of the window points, disregarding a given fraction of noisy distances. We also provide experimental evidence of the practical viability of the improved clustering and diameter estimation algorithms.
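To make the two quantities named in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not the authors' streaming algorithm: it only evaluates the k-center objective with z outliers for a given set of centers, and an effective diameter taken here (as an assumption) to be the smallest distance d such that at least a fraction alpha of the pairwise distances in the window are at most d. The function names, the Euclidean metric, and the sample window are all hypothetical and chosen for illustration.

```python
import math
from itertools import combinations


def radius_with_outliers(points, centers, z):
    """Radius of a candidate solution after discarding the z farthest points.

    Illustrative objective only: each point is charged its distance to the
    nearest center, and the z largest charges are treated as outliers.
    """
    dists = sorted(min(math.dist(p, c) for c in centers) for p in points)
    kept = dists[:len(dists) - z] if z > 0 else dists
    return kept[-1] if kept else 0.0


def effective_diameter(points, alpha):
    """Smallest d such that at least a fraction alpha of pairwise distances are <= d.

    This definition is an assumption made for the sketch; alpha = 1.0 gives
    the ordinary diameter.
    """
    dists = sorted(math.dist(p, q) for p, q in combinations(points, 2))
    if not dists:
        return 0.0
    idx = max(math.ceil(alpha * len(dists)) - 1, 0)
    return dists[idx]


# Hypothetical window: two tight groups plus one far-away noisy point.
window = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
          (10.0, 10.0), (11.0, 10.0), (100.0, 100.0)]
centers = [(0.0, 0.0), (10.0, 10.0)]

print(radius_with_outliers(window, centers, z=0))  # dominated by the noisy point
print(radius_with_outliers(window, centers, z=1))  # noisy point disregarded
print(effective_diameter(window, alpha=0.5))
```

With z = 0 the radius is determined by the single distant point, while z = 1 drops it and the radius falls to the spread of the two groups, which is the sensitivity to noise that motivates the variant with outliers. Both functions scan the whole window, so they use memory linear in |W| or worse, in contrast to the memory bounds stated in the abstract.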
Pages: 26