Fast Parameterless Density-Based Clustering via Random Projections

Cited by: 26
Authors
Schneider, Johannes [1]
Vlachos, Michail [1]
Affiliations
[1] IBM Res Zurich, Zurich, Switzerland
Keywords
Clustering; Data Mining; Random Projections
DOI
10.1145/2505515.2505590
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering offers significant insights in data analysis. Density-based algorithms have emerged as flexible and efficient techniques, able to discover high-quality, and potentially irregularly shaped, clusters. We present two fast density-based clustering algorithms based on random projections. Both algorithms demonstrate one to two orders of magnitude speedup compared to equivalent state-of-the-art density-based techniques, even for modest-size datasets. We give a comprehensive analysis of both our algorithms and show a runtime of O(dN log^2 N) for a d-dimensional dataset of N points. Our first algorithm can be viewed as a fast variant of the OPTICS density-based algorithm, but using a softer definition of density combined with sampling. The second algorithm is parameter-less and identifies areas separating clusters.
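The abstract does not spell out either algorithm, so the following is only a rough sketch of the general idea of finding local neighborhoods by repeated random projections. It is not the authors' method. The function names `partition` and `leaf_neighbors`, the Gaussian random direction, the uniformly random cut position, and the parameters `min_size` and `repetitions` are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Sequence, Set

Point = Sequence[float]


def partition(indices: List[int], data: Sequence[Point],
              min_size: int, rng: random.Random) -> List[List[int]]:
    """Recursively split `indices` until every part has at most `min_size` points.

    Assumed scheme: project onto a random direction, sort by projected value,
    and cut the sorted order at a uniformly random position.
    """
    if min_size < 1:
        raise ValueError("min_size must be at least 1")
    if len(indices) <= min_size:
        return [list(indices)]
    dim = len(data[indices[0]])
    # Random direction with independent Gaussian components (an assumption).
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]

    def projection(i: int) -> float:
        return sum(x * w for x, w in zip(data[i], direction))

    ordered = sorted(indices, key=projection)
    # randint is inclusive on both ends, so both halves are non-empty.
    cut = rng.randint(1, len(ordered) - 1)
    return (partition(ordered[:cut], data, min_size, rng)
            + partition(ordered[cut:], data, min_size, rng))


def leaf_neighbors(data: Sequence[Point], min_size: int,
                   repetitions: int, seed: int = 0) -> Dict[int, Set[int]]:
    """Collect, for each point, the points that shared a leaf with it
    in at least one of `repetitions` independent partitionings."""
    rng = random.Random(seed)
    neighbors: Dict[int, Set[int]] = {i: set() for i in range(len(data))}
    for _ in range(repetitions):
        for leaf in partition(list(range(len(data))), data, min_size, rng):
            for i in leaf:
                neighbors[i].update(j for j in leaf if j != i)
    return neighbors


if __name__ == "__main__":
    demo_rng = random.Random(1)
    # Two hypothetical 2-D point clouds centered at (0, 0) and (5, 5).
    cloud = [(demo_rng.gauss(c, 0.2), demo_rng.gauss(c, 0.2))
             for c in (0.0, 5.0) for _ in range(20)]
    found = leaf_neighbors(cloud, min_size=5, repetitions=10)
    print(len(found[0]), "candidate neighbors for point 0")
```

Points that repeatedly land in the same small leaf are likely to be close in the original space, so a density estimate could be built from these candidate neighbor sets without computing all pairwise distances. The recursion depth grows with the number of points in the worst case, so this sketch suits small inputs only.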
Pages: 861-866
Page count: 6
Related papers
50 items in total
  • [41] Density-based clustering of social networks
    Menardi, Giovanna
    De Stefano, Domenico
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES A-STATISTICS IN SOCIETY, 2022, 185 (03) : 1004 - 1029
  • [42] Scalable density-based distributed clustering
    Januzaj, E
    Kriegel, HP
    Pfeifle, M
    KNOWLEDGE DISCOVERY IN DATABASES: PKDD 2004, PROCEEDINGS, 2004, 3202 : 231 - 244
  • [43] A Fast and More Accurate Seed-and-Extension Density-Based Clustering Algorithm
    Tung, Ming-Hao
    Chen, Yi-Ping Phoebe
    Liu, Chen-Yu
    Liao, Chung-Shou
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (06) : 5458 - 5471
  • [44] A Fast Algorithm for Identifying Density-Based Clustering Structures Using a Constraint Graph
    Kim, Jeong-Hun
    Choi, Jong-Hyeok
    Yoo, Kwan-Hee
    Loh, Woong-Kee
    Nasridinov, Aziz
    ELECTRONICS, 2019, 8 (10)
  • [45] DBHD: Density-based clustering for highly varying density
    Durani, Walid
    Mautz, Dominik
    Plant, Claudia
    Boehm, Christian
    2022 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2022, : 921 - 926
  • [46] ClusterMPP: An unsupervised density-based clustering algorithm via Marked Point Process
    Henni, Khadidja
    Alata, Olivier
    Zaoui, Lynda
    Vannier, Brigitte
    El Idrissi, Abdellatif
    Moussa, Ahmed
    INTELLIGENT DATA ANALYSIS, 2017, 21 (04) : 827 - 847
  • [47] Multiscale PMU Data Compression via Density-Based WAMS Clustering Analysis
    Lee, Gyul
    Kim, Do-In
    Kim, Seon Hyeog
    Shin, Yong-June
    ENERGIES, 2019, 12 (04)
  • [48] Universal Detection of Backdoor Attacks via Density-Based Clustering and Centroids Analysis
    Guo, Wei
    Tondi, Benedetta
    Barni, Mauro
    IEEE Transactions on Information Forensics and Security, 2024, 19 : 970 - 984
  • [49] Universal Detection of Backdoor Attacks via Density-Based Clustering and Centroids Analysis
    Guo, Wei
    Tondi, Benedetta
    Barni, Mauro
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2024, 19 : 970 - 984
  • [50] Better than the best? Answers via model ensemble in density-based clustering
    Alessandro Casa
    Luca Scrucca
    Giovanna Menardi
    Advances in Data Analysis and Classification, 2021, 15 : 599 - 623