Efficient online spherical K-means clustering

Cited by: 0

Author:

Zhong, S [1]

Affiliation:

[1] Florida Atlantic Univ, Dept Comp Sci & Engn, Boca Raton, FL 33431 USA

Source:

PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), VOLS 1-5 | 2005

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The spherical k-means algorithm, i.e., the k-means algorithm with cosine similarity, is a popular method for clustering high-dimensional text data. In this algorithm, each document as well as each cluster mean is represented as a high-dimensional unit-length vector. However, it has mainly been used in batch mode: each cluster mean vector is updated only after all document vectors have been assigned, as the (normalized) average of all the document vectors assigned to that cluster. This paper investigates an online version of the spherical k-means algorithm based on the well-known Winner-Take-All competitive learning. In this online algorithm, each cluster centroid is incrementally updated given a document. We demonstrate that the online spherical k-means algorithm can achieve significantly better clustering results than the batch version, especially when an annealing-type learning rate schedule is used. We also present heuristics that improve speed with almost no loss of clustering quality.
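To make the online update described in the abstract concrete, here is a minimal pure-Python sketch of winner-take-all spherical k-means. It is an illustration under stated assumptions, not the paper's implementation. The function name `online_spherical_kmeans`, the parameters `eta_start` and `eta_end`, the geometric decay of the learning rate, and the initialisation from randomly sampled documents are all hypothetical choices, since the abstract only says that an "annealing-type" schedule is used.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(u, v):
    """Cosine similarity of two unit-length vectors, i.e. their dot product."""
    return sum(a * b for a, b in zip(u, v))


def online_spherical_kmeans(docs, k, epochs=10, eta_start=1.0, eta_end=0.01, seed=0):
    """Winner-take-all online spherical k-means (illustrative sketch).

    Assumptions not taken from the abstract: the geometric learning-rate
    decay from eta_start to eta_end, and seeding centroids with k documents.
    """
    rng = random.Random(seed)
    docs = [normalize(d) for d in docs]
    centroids = [list(d) for d in rng.sample(docs, k)]
    total_steps = epochs * len(docs)
    step = 0
    for _ in range(epochs):
        order = list(range(len(docs)))
        rng.shuffle(order)  # present documents in random order each epoch
        for i in order:
            x = docs[i]
            # Annealing-type schedule: the learning rate shrinks as training proceeds.
            eta = eta_start * (eta_end / eta_start) ** (step / total_steps)
            # Winner = centroid with the highest cosine similarity to x.
            winner = max(range(k), key=lambda j: cosine(centroids[j], x))
            # Move only the winner toward x, then project back onto the unit sphere.
            moved = [c + eta * a for c, a in zip(centroids[winner], x)]
            centroids[winner] = normalize(moved)
            step += 1
    # Final hard assignment of every document to its most similar centroid.
    labels = [max(range(k), key=lambda j: cosine(centroids[j], x)) for x in docs]
    return centroids, labels
```

In contrast to batch mode, where a centroid is recomputed only after every document has been assigned, this loop updates the winning centroid immediately after each document is seen. Calling `online_spherical_kmeans(docs, k=2)` on a small list of numeric vectors returns the unit-length centroids and one cluster label per document.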


Pages: 3180-3185

Number of pages: 6
