Efficient online spherical K-means clustering

Cited by: 0
Author
Zhong, S [1]
Affiliation
[1] Florida Atlantic Univ, Dept Comp Sci & Engn, Boca Raton, FL 33431 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The spherical k-means algorithm, i.e., the k-means algorithm with cosine similarity, is a popular method for clustering high-dimensional text data. In this algorithm, each document as well as each cluster mean is represented as a high-dimensional unit-length vector. However, it has been mainly used in batch mode. That is, each cluster mean vector is updated only after all document vectors have been assigned, as the (normalized) average of all the document vectors assigned to that cluster. This paper investigates an online version of the spherical k-means algorithm based on the well-known Winner-Take-All competitive learning. In this online algorithm, each cluster centroid is incrementally updated given a document. We demonstrate that the online spherical k-means algorithm can achieve significantly better clustering results than the batch version, especially when an annealing-type learning rate schedule is used. We also present heuristics to improve the speed with almost no loss of clustering quality.
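The online update described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the random-document initialisation, the optional `init` argument, and the exponential decay of the learning rate from `eta0` to `eta_f` are all assumptions made here for illustration. For each document, only the most similar centroid (the winner) is moved toward the document and then renormalised to unit length.

```python
import math
import random


def normalize(v):
    """Return v scaled to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    """Dot product; equals cosine similarity when both vectors have unit length."""
    return sum(x * y for x, y in zip(a, b))


def nearest(centroids, x):
    """Index of the centroid with the highest cosine similarity to unit vector x."""
    return max(range(len(centroids)), key=lambda j: dot(centroids[j], x))


def online_spkmeans(docs, k, epochs=10, eta0=1.0, eta_f=0.01, init=None, seed=0):
    """Winner-take-all online spherical k-means (illustrative sketch only)."""
    rng = random.Random(seed)
    docs = [normalize(d) for d in docs]
    if init is None:
        # Assumption: seed the centroids with k distinct, randomly chosen documents.
        centroids = [list(d) for d in rng.sample(docs, k)]
    else:
        centroids = [normalize(c) for c in init]
    total_steps = epochs * len(docs)
    step = 0
    for _ in range(epochs):
        order = list(range(len(docs)))
        rng.shuffle(order)
        for i in order:
            x = docs[i]
            # Assumed annealing schedule: exponential decay from eta0 to eta_f.
            eta = eta0 * (eta_f / eta0) ** (step / total_steps)
            w = nearest(centroids, x)
            # Move only the winning centroid toward x, then project it
            # back onto the unit sphere.
            moved = [m + eta * xi for m, xi in zip(centroids[w], x)]
            centroids[w] = normalize(moved)
            step += 1
    labels = [nearest(centroids, x) for x in docs]
    return centroids, labels
```

A call such as `online_spkmeans(vectors, k=5)` returns the unit-length centroids and one cluster index per input vector. In the batch variant, by contrast, each centroid would be recomputed as the normalised mean of its assigned documents only after a full pass over the data.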
Pages: 3180-3185
Number of pages: 6
Related papers
50 in total
  • [1] Efficient Sparse Spherical k-Means for Document Clustering
    Knittel, Johannes
    Koch, Steffen
    Ertl, Thomas
    [J]. PROCEEDINGS OF THE 21ST ACM SYMPOSIUM ON DOCUMENT ENGINEERING (DOCENG '21), 2021,
  • [2] Spherical k-Means Clustering
    Hornik, Kurt
    Feinerer, Ingo
    Kober, Martin
    Buchta, Christian
    [J]. JOURNAL OF STATISTICAL SOFTWARE, 2012, 50 (10): : 1 - 22
  • [3] K-Means Cloning: Adaptive Spherical K-Means Clustering
    Hedar, Abdel-Rahman
    Ibrahim, Abdel-Monem M.
    Abdel-Hakim, Alaa E.
    Sewisy, Adel A.
    [J]. ALGORITHMS, 2018, 11 (10):
  • [4] Online k-means Clustering
    Cohen-Addad, Vincent
    Guedj, Benjamin
    Kanade, Varun
    Rom, Guy
    [J]. 24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
  • [5] Refining spherical K-means for clustering documents
    Peng, Jiming
    Zhu, Jiaping
    [J]. 2006 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORK PROCEEDINGS, VOLS 1-10, 2006, : 4146 - +
  • [6] The seeding algorithms for spherical k-means clustering
    Min Li
    Dachuan Xu
    Dongmei Zhang
    Juan Zou
    [J]. Journal of Global Optimization, 2020, 76 : 695 - 708
  • [7] The seeding algorithms for spherical k-means clustering
    Li, Min
    Xu, Dachuan
    Zhang, Dongmei
    Zou, Juan
    [J]. JOURNAL OF GLOBAL OPTIMIZATION, 2020, 76 (04) : 695 - 708
  • [8] K*-Means: An Effective and Efficient K-means Clustering Algorithm
    Qi, Jianpeng
    Yu, Yanwei
    Wang, Lihong
    Liu, Jinglei
    [J]. PROCEEDINGS OF 2016 IEEE INTERNATIONAL CONFERENCES ON BIG DATA AND CLOUD COMPUTING (BDCLOUD 2016) SOCIAL COMPUTING AND NETWORKING (SOCIALCOM 2016) SUSTAINABLE COMPUTING AND COMMUNICATIONS (SUSTAINCOM 2016) (BDCLOUD-SOCIALCOM-SUSTAINCOM 2016), 2016, : 242 - 249
  • [9] Robust Algorithms for Online k-means Clustering
    Bhaskara, Aditya
    Ruwanpathirana, Aravinda Kanchana
    [J]. ALGORITHMIC LEARNING THEORY, VOL 117, 2020, 117 : 148 - 173
  • [10] Online K-Means Clustering with Lightweight Coresets
    Low, Jia Shun
    Ghafoori, Zahra
    Leckie, Christopher
    [J]. AI 2019: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, 11919 : 191 - 202