Fast Parameterless Density-Based Clustering via Random Projections

Cited by: 26
Authors
Schneider, Johannes [1]
Vlachos, Michail [1]
Affiliations
[1] IBM Res Zurich, Zurich, Switzerland
Keywords
Clustering; Data Mining; Random Projections
DOI
10.1145/2505515.2505590
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering offers significant insights in data analysis. Density-based algorithms have emerged as flexible and efficient techniques, able to discover high-quality (and potentially irregularly shaped) clusters. We present two fast density-based clustering algorithms based on random projections. Both algorithms demonstrate a speedup of one to two orders of magnitude compared to equivalent state-of-the-art density-based techniques, even for modest-size datasets. We give a comprehensive analysis of both algorithms and show a runtime of O(dN log^2 N) for a d-dimensional dataset. Our first algorithm can be viewed as a fast variant of the OPTICS density-based algorithm, but using a softer definition of density combined with sampling. The second algorithm is parameterless and identifies areas separating clusters.
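The abstract names random projections as the common building block but gives no procedure. The following is only a minimal sketch of one way such a building block might look, not the authors' algorithm: points are projected onto a randomly drawn direction, split at a random position along that direction, and the two halves are processed recursively until each group is small. The function names `project` and `partition`, the parameter `min_size`, and the choice of a Gaussian random direction are all assumptions made for illustration.

```python
import random


def project(point, direction):
    """Dot product of a point with a direction vector (its 1-D projection)."""
    return sum(p * w for p, w in zip(point, direction))


def partition(points, indices, min_size, rng):
    """Recursively split `indices` using random 1-D projections.

    Hypothetical helper for illustration only; `min_size` is an assumed
    stopping threshold, not a parameter taken from the paper.
    Returns a list of index groups, each holding at most `min_size` indices.
    """
    if min_size < 1:
        raise ValueError("min_size must be at least 1")
    if len(indices) <= min_size:
        return [list(indices)]
    dim = len(points[0])
    # Draw a random direction; it need not be normalised for ordering.
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    # Order the points along that direction (each projection costs O(d)).
    ordered = sorted(indices, key=lambda i: project(points[i], direction))
    # Cut at a random position so that both sides are non-empty.
    cut = rng.randint(1, len(ordered) - 1)
    return (partition(points, ordered[:cut], min_size, rng)
            + partition(points, ordered[cut:], min_size, rng))


# Example: two well-separated 2-D blobs of 50 points each.
rng = random.Random(0)
points = ([[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
          + [[rng.gauss(10, 1), rng.gauss(10, 1)] for _ in range(50)])
leaves = partition(points, list(range(len(points))), 8, rng)
```

Under these assumptions, points that end up in the same small group over many independent repetitions are likely to be close to each other, which is the kind of neighbourhood information a density estimate could be built on without computing all pairwise distances.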
Pages: 861-866
Number of pages: 6