Efficient online spherical K-means clustering

Cited by: 0
Authors
Zhong, S [1]
Affiliations
[1] Florida Atlantic Univ, Dept Comp Sci & Engn, Boca Raton, FL 33431 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The spherical k-means algorithm, i.e., the k-means algorithm with cosine similarity, is a popular method for clustering high-dimensional text data. In this algorithm, each document, as well as each cluster mean, is represented as a high-dimensional unit-length vector. However, it has mainly been used in batch mode: each cluster mean vector is updated only after all document vectors have been assigned, as the (normalized) average of all the document vectors assigned to that cluster. This paper investigates an online version of the spherical k-means algorithm based on the well-known Winner-Take-All competitive learning. In this online algorithm, each cluster centroid is incrementally updated given a document. We demonstrate that the online spherical k-means algorithm can achieve significantly better clustering results than the batch version, especially when an annealing-type learning rate schedule is used. We also present heuristics that improve speed with almost no loss of clustering quality.
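The online update described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `online_spherical_kmeans`, the exponentially decaying learning rate, the default values of `eta_start` and `eta_end`, and the initialization from randomly chosen documents are all assumptions made for illustration. For each document, only the most similar (winning) centroid is moved toward the document and then renormalized to unit length.

```python
import math
import random


def normalize(v):
    """Scale v to unit Euclidean length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    """Dot product; equals cosine similarity when both vectors have unit length."""
    return sum(x * y for x, y in zip(a, b))


def online_spherical_kmeans(docs, k, epochs=10, eta_start=1.0, eta_end=0.01, seed=0):
    """Winner-Take-All online spherical k-means (illustrative sketch).

    The annealing schedule below (exponential decay from eta_start to
    eta_end over all updates) is an assumed example of an
    "annealing-type" learning rate, not a value taken from the abstract.
    """
    rng = random.Random(seed)
    docs = [normalize(d) for d in docs]
    # Assumption: initialize centroids with k distinct randomly chosen documents.
    centroids = [list(d) for d in rng.sample(docs, k)]
    total_steps = epochs * len(docs)
    step = 0
    for _ in range(epochs):
        order = list(range(len(docs)))
        rng.shuffle(order)
        for i in order:
            x = docs[i]
            # Learning rate shrinks as more documents are processed.
            eta = eta_start * (eta_end / eta_start) ** (step / total_steps)
            # Winner-Take-All: only the most similar centroid is updated.
            winner = max(range(k), key=lambda j: dot(x, centroids[j]))
            moved = [m + eta * xi for m, xi in zip(centroids[winner], x)]
            centroids[winner] = normalize(moved)  # keep the centroid on the unit sphere
            step += 1
    labels = [max(range(k), key=lambda j: dot(x, centroids[j])) for x in docs]
    return centroids, labels
```

By contrast, a batch version would first assign every document and only then recompute each centroid as the normalized mean of its members; here a centroid changes immediately after each document is seen.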
Pages: 3180-3185
Page count: 6