Scaling up kernel grower clustering method for large data sets via core-sets

Authors
Chang, Liang [1]
Deng, Xiao-Ming [2, 3]
Zheng, Sui-Wu [1]
Wang, Yong-Qing [1]
Affiliations
[1] Key Laboratory of Complex System and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, Beijing 100080, China
[2] Virtual Reality Laboratory, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
[3] National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100080, China
Source
Zidonghua Xuebao/Acta Automatica Sinica | 2008, Vol. 34, No. 3
Funding
National Natural Science Foundation of China
Keywords
Data mining - Data structures - Image segmentation - Pattern recognition - Self organizing maps
DOI
10.3724/SP.J.1004.2008.00376
Abstract
Kernel grower is a kernel clustering method recently proposed by Camastra and Verri. It performs well on a variety of data sets and compares favorably with popular clustering algorithms. Its main drawback, however, is that it scales poorly to large data sets, which greatly restricts its applicability. In this paper, we propose a scaled-up kernel grower method based on core-sets, which is significantly faster than the original method on large data sets and can handle very large data sets. Numerical experiments on benchmark and synthetic data sets demonstrate the efficiency of the proposed method. The method is also applied to real image segmentation to illustrate its performance.
Pages: 376-382