RNN-DBSCAN: A Density-Based Clustering Algorithm Using Reverse Nearest Neighbor Density Estimates

Cited by: 195
Authors
Bryant, Avory [1, 2]
Cios, Krzysztof [3, 4]
Affiliations
[1] Virginia Commonwealth Univ, Dept Comp Sci, Med Coll Virginia Campus, Richmond, VA 23284 USA
[2] Naval Surface Warfare Ctr Dahlgren Div, Dahlgren, VA 22448 USA
[3] Virginia Commonwealth Univ, Comp Sci Dept, Med Coll Virginia Campus, Richmond, VA 23284 USA
[4] Polish Acad Sci, PL-20290 Lublin, Poland
Keywords
Unsupervised learning; pattern analysis; clustering algorithms; pattern clustering; density estimation robust algorithm; nearest neighbor searches
DOI
10.1109/TKDE.2017.2787640
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A new density-based clustering algorithm, RNN-DBSCAN, is presented which uses reverse nearest neighbor counts as an estimate of observation density. Clustering is performed using a DBSCAN-like approach based on k nearest neighbor graph traversals through dense observations. RNN-DBSCAN is preferable to the popular density-based clustering algorithm DBSCAN in two respects. First, problem complexity is reduced to a single parameter (the choice of k nearest neighbors), and second, it is better able to handle large variations in cluster density (heterogeneous density). The superiority of RNN-DBSCAN is demonstrated on several artificial and real-world datasets with respect to prior work on reverse nearest neighbor based clustering approaches (RECORD, IS-DBSCAN, and ISB-DBSCAN) along with DBSCAN and OPTICS. Each of these clustering approaches is described by a common graph-based interpretation wherein clusters of dense observations are defined as connected components, along with a discussion of their computational complexity. Heuristics for RNN-DBSCAN parameter selection are presented, and the effects of k on RNN-DBSCAN clusterings are discussed. Additionally, with respect to scalability, an approximate version of RNN-DBSCAN is presented leveraging an existing approximate k nearest neighbor technique.
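The idea summarized in the abstract (reverse nearest neighbor counts as a density estimate, followed by a DBSCAN-like traversal of the k nearest neighbor graph) can be sketched roughly as below. This is a simplified illustration and not the authors' published procedure: the core test (reverse neighbor count of at least k), the brute-force neighbor search, the handling of border points, and all function names are assumptions made for the example.

```python
from collections import deque
from math import dist


def knn_indices(points, k):
    """Brute-force k nearest neighbors; returns a list of index lists."""
    n = len(points)
    neighbors = []
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: dist(points[i], points[j]),
        )
        neighbors.append(others[:k])
    return neighbors


def reverse_nn_counts(neighbors):
    """Count, for each point, how many other points list it as a neighbor."""
    counts = [0] * len(neighbors)
    for nbrs in neighbors:
        for j in nbrs:
            counts[j] += 1
    return counts


def rnn_dbscan_sketch(points, k):
    """Label points with cluster ids; -1 marks points left as noise.

    Assumption: a point is treated as dense (core) when its reverse
    neighbor count is at least k. Clusters grow by breadth-first
    traversal of the kNN graph, expanding only through core points.
    """
    neighbors = knn_indices(points, k)
    counts = reverse_nn_counts(neighbors)
    core = [c >= k for c in counts]
    labels = [-1] * len(points)
    cluster = 0
    for seed in range(len(points)):
        if not core[seed] or labels[seed] != -1:
            continue
        labels[seed] = cluster
        queue = deque([seed])
        while queue:
            x = queue.popleft()
            for y in neighbors[x]:
                if labels[y] == -1:
                    labels[y] = cluster
                    if core[y]:  # only dense points keep expanding
                        queue.append(y)
        cluster += 1
    return labels


if __name__ == "__main__":
    # Two well-separated 5x5 grids (hypothetical toy data).
    grid_a = [(float(i), float(j)) for i in range(5) for j in range(5)]
    grid_b = [(100.0 + i, float(j)) for i in range(5) for j in range(5)]
    print(rnn_dbscan_sketch(grid_a + grid_b, k=4))
```

The only parameter is k, which mirrors the single-parameter property claimed in the abstract; the quadratic neighbor search here is for readability only and would be replaced by an index or an approximate method on larger data.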
Pages: 1109-1121
Number of pages: 13