Robust Algorithms for Online k-means Clustering

Cited by: 0
Authors
Bhaskara, Aditya [1]
Ruwanpathirana, Aravinda Kanchana [1]
Affiliations
[1] University of Utah, School of Computing, Salt Lake City, UT 84112, USA
Source
Keywords
Online algorithms; k-means clustering; Robust algorithms
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In the online version of the classic k-means clustering problem, the points of a dataset u(1), u(2), ... arrive one after another in an arbitrary order. When the algorithm sees a point, it must either add it to the set of centers or discard it. Once added, a center cannot be removed. The goal is to end up with a set of roughly k centers while competing, in k-means objective value, with the best set of k centers in hindsight. Online versions of k-means and other clustering problems have received significant attention in the literature. The key idea in many algorithms is adaptive sampling: when a new point arrives, it is added to the set of centers with a probability that depends on its distance to the centers chosen so far. Our contributions are as follows: 1. We give a modified adaptive sampling procedure that obtains a better approximation ratio (improving it from logarithmic to constant). 2. Our main result shows how to perform adaptive sampling when the data has outliers (>> k points that are potentially arbitrarily far from the actual data, which makes distance-based sampling prone to picking the outliers). 3. We also discuss lower bounds for k-means clustering in an online setting.
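The adaptive sampling idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' procedure: the function names `sq_dist` and `adaptive_sampling`, the fixed `scale` parameter, and the rule "keep with probability min(1, d^2 / scale)" are assumptions chosen for illustration only. Algorithms in the literature typically adjust the scale as points arrive, which this sketch does not attempt.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def adaptive_sampling(points, scale, rng=None):
    """Illustrative online center selection (hypothetical, not the paper's method).

    Each arriving point is kept as a center with probability
    min(1, d^2 / scale), where d is its distance to the nearest center
    chosen so far. Centers are never removed once added.
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is reproducible
    centers = []
    for u in points:
        if not centers:
            centers.append(u)  # the first point always becomes a center
            continue
        d2 = min(sq_dist(u, c) for c in centers)
        if rng.random() < min(1.0, d2 / scale):
            centers.append(u)
    return centers
```

Under this rule a point that coincides with an existing center is never added, while a point at squared distance at least `scale` from every center is always added. The second case is the weakness the abstract mentions: an arbitrarily distant outlier is selected with certainty by purely distance-based sampling.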
Pages: 148-173
Number of pages: 26