Scaling up kernel grower clustering method for large data sets via core-sets

Cited by: 0

Authors:

Chang, Liang [1]

Deng, Xiao-Ming [2,3]

Zheng, Sui-Wu [1]

Wang, Yong-Qing [1]

Affiliations:

[1] Key Laboratory of Complex System and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, Beijing 100080, China

[2] Virtual Reality Laboratory, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China

[3] National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100080, China

Source:

Zidonghua Xuebao/Acta Automatica Sinica | 2008, Vol. 34, No. 03

Funding:

National Natural Science Foundation of China

Keywords:

Data mining - Data structures - Image segmentation - Pattern recognition - Self organizing maps

DOI:

10.3724/SP.J.1004.2008.00376


Abstract:

Kernel grower is a kernel clustering method recently proposed by Camastra and Verri. It performs well on a variety of data sets and compares favorably with popular clustering algorithms. Its main drawback is poor scalability to large data sets, which greatly restricts its application. In this paper, we propose a scaled-up kernel grower method based on core-sets. It is significantly faster than the original method when clustering large data sets, and it can handle very large data sets. Numerical experiments on benchmark and synthetic data sets demonstrate the efficiency of the proposed method. The method is also applied to real image segmentation to illustrate its performance.

Citation

Pages: 376-382
