Fast Parameterless Density-Based Clustering via Random Projections

Cited by: 26
Authors
Schneider, Johannes [1]
Vlachos, Michail [1]
Affiliations
[1] IBM Research Zurich, Zurich, Switzerland
Keywords
Clustering; Data Mining; Random Projections
DOI
10.1145/2505515.2505590
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering offers significant insights in data analysis. Density-based algorithms have emerged as flexible and efficient techniques, able to discover high-quality (and potentially irregularly shaped) clusters. We present two fast density-based clustering algorithms based on random projections. Both algorithms demonstrate a speedup of one to two orders of magnitude compared to equivalent state-of-the-art density-based techniques, even for modest-size datasets. We give a comprehensive analysis of both algorithms and show a runtime of O(dN log² N) for a d-dimensional dataset. Our first algorithm can be viewed as a fast variant of the OPTICS density-based algorithm, but using a softer definition of density combined with sampling. The second algorithm is parameterless and identifies areas separating clusters.
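The abstract does not say how the random projections are used. As a rough illustration only, and not the authors' algorithm, the sketch below shows one generic way a random projection can partition a point set: project the points onto a random direction, split the ordered projections at a random position, and recurse until every group is smaller than a threshold. The function name `random_projection_split` and the parameter `min_size` are hypothetical and chosen for this example.

```python
import random


def random_projection_split(points, min_size, rng):
    """Partition point indices by repeated splits along random directions.

    Illustrative sketch (assumed scheme, not taken from the paper).
    points   : sequence of equal-length numeric tuples
    min_size : groups of fewer than min_size points are not split further (>= 2)
    rng      : a random.Random instance, passed in for reproducibility
    """
    if min_size < 2:
        raise ValueError("min_size must be at least 2")
    dim = len(points[0])
    groups = []
    stack = [list(range(len(points)))]
    while stack:
        idx = stack.pop()
        if len(idx) < min_size:
            groups.append(idx)
            continue
        # Draw a random direction and project every point in the group onto it.
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        proj = {i: sum(x * d for x, d in zip(points[i], direction)) for i in idx}
        # Order the points along the direction and cut at a random position,
        # so that both sides are guaranteed to be non-empty.
        ordered = sorted(idx, key=proj.__getitem__)
        cut = rng.randint(1, len(ordered) - 1)
        stack.append(ordered[:cut])
        stack.append(ordered[cut:])
    return groups


if __name__ == "__main__":
    data_rng = random.Random(0)
    pts = [(data_rng.random(), data_rng.random()) for _ in range(50)]
    for group in random_projection_split(pts, 5, random.Random(1)):
        print(group)
```

Each projection costs O(d) per point, which is why projection-based schemes scale linearly in the dimension d. Points that repeatedly land in the same small group across many independent partitionings are likely to be close, which is the kind of signal a density estimate could be built on.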
Pages: 861-866
Page count: 6
Related papers
50 in total
  • [21] Generalizing Local Density for Density-Based Clustering
    Lin, Jun-Lin
    SYMMETRY-BASEL, 2021, 13 (02) : 1 - 24
  • [22] Density-Based Clustering for Adaptive Density Variation
    Qian, Li
    Plant, Claudia
    Boehm, Christian
    2021 21ST IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2021), 2021, : 1282 - 1287
  • [23] Novel density-based and hierarchical density-based clustering algorithms for uncertain data
    Zhang, Xianchao
    Liu, Han
    Zhang, Xiaotong
    NEURAL NETWORKS, 2017, 93 : 240 - 255
  • [24] Estimating Number of Speakers via Density-Based Clustering and Classification Decision
    Yang, Junjie
    Guo, Yi
    Yang, Zuyuan
    Yang, Liu
    Xie, Shengli
    IEEE ACCESS, 2019, 7 : 176541 - 176551
  • [25] Achieving k-anonymity via a density-based clustering method
    Zhu, Hua
    Ye, Xiaojun
    ADVANCES IN DATA AND WEB MANAGEMENT, PROCEEDINGS, 2007, 4505 : 745 - +
  • [26] Practical and Privacy-Preserving Density-Based Clustering via Shuffling
    Wang, Yingzhe
    Li, Hongwei
    Chen, Hanxiao
    Zhang, Xilin
    Hao, Meng
    Proceedings - IEEE Global Communications Conference, GLOBECOM, 2023, : 50 - 55
  • [27] OLAP over Continuous Domains via Density-Based Hierarchical Clustering
    Ceci, Michelangelo
    Cuzzocrea, Alfredo
    Malerba, Donato
    KNOWLEDGE-BASED AND INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT II: 15TH INTERNATIONAL CONFERENCE, KES 2011, 2011, 6882 : 559 - 570
  • [28] Practical and Privacy-Preserving Density-Based Clustering via Shuffling
    Wang, Yingzhe
    Li, Hongwei
    Chen, Hanxiao
    Zhang, Xilin
    Hao, Meng
    IEEE CONFERENCE ON GLOBAL COMMUNICATIONS, GLOBECOM, 2023, : 50 - 55
  • [29] An improved method for density-based clustering
    Jin, Hong
    Wang, Shuliang
    Zhou, Qian
    Li, Ying
    INTERNATIONAL JOURNAL OF DATA MINING MODELLING AND MANAGEMENT, 2014, 6 (04) : 347 - 368
  • [30] FULLY ADAPTIVE DENSITY-BASED CLUSTERING
    Steinwart, Ingo
    ANNALS OF STATISTICS, 2015, 43 (05) : 2132 - 2167