RNN-DBSCAN: A Density-Based Clustering Algorithm Using Reverse Nearest Neighbor Density Estimates

Cited by: 195
Authors
Bryant, Avory [1, 2]
Cios, Krzysztof [3, 4]
Affiliations
[1] Virginia Commonwealth Univ, Dept Comp Sci, Med Coll Virginia Campus, Richmond, VA 23284 USA
[2] Naval Surface Warfare Ctr Dahlgren Div, Dahlgren, VA 22448 USA
[3] Virginia Commonwealth Univ, Comp Sci Dept, Med Coll Virginia Campus, Richmond, VA 23284 USA
[4] Polish Acad Sci, PL-20290 Lublin, Poland
Keywords
Unsupervised learning; pattern analysis; clustering algorithms; pattern clustering; density estimation robust algorithm; nearest neighbor searches
DOI
10.1109/TKDE.2017.2787640
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A new density-based clustering algorithm, RNN-DBSCAN, is presented which uses reverse nearest neighbor counts as an estimate of observation density. Clustering is performed using a DBSCAN-like approach based on k nearest neighbor graph traversals through dense observations. RNN-DBSCAN is preferable to the popular density-based clustering algorithm DBSCAN in two respects. First, problem complexity is reduced to the choice of a single parameter (the number of nearest neighbors, k). Second, it has an improved ability to handle large variations in cluster density (heterogeneous density). The superiority of RNN-DBSCAN is demonstrated on several artificial and real-world datasets with respect to prior work on reverse nearest neighbor based clustering approaches (RECORD, IS-DBSCAN, and ISB-DBSCAN) along with DBSCAN and OPTICS. Each of these clustering approaches is described by a common graph-based interpretation wherein clusters of dense observations are defined as connected components, along with a discussion on their computational complexity. Heuristics for RNN-DBSCAN parameter selection are presented, and the effects of k on RNN-DBSCAN clusterings are discussed. Additionally, with respect to scalability, an approximate version of RNN-DBSCAN is presented that leverages an existing approximate k nearest neighbor technique.
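The two ideas named in the abstract, reverse nearest neighbor counts as a density estimate and a DBSCAN-like traversal of the k nearest neighbor graph, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that an observation counts as dense ("core") when at least k other observations list it among their k nearest neighbors, and it uses a brute-force neighbor search. The function names `knn_lists`, `reverse_counts`, and `cluster` are hypothetical, and any extra steps of the published algorithm that the abstract does not describe are left out.

```python
from math import dist


def knn_lists(points, k):
    """For each point index, return the indices of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        # Brute-force search; ties in distance are broken by index.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: (dist(p, points[j]), j),
        )
        neighbors.append(others[:k])
    return neighbors


def reverse_counts(neighbors):
    """Count, for each point, how many other points list it as a neighbor."""
    counts = [0] * len(neighbors)
    for nbrs in neighbors:
        for j in nbrs:
            counts[j] += 1
    return counts


def cluster(points, k):
    """Label points by traversing the kNN graph through dense points.

    Assumption for this sketch: a point is dense (core) when its reverse
    nearest neighbor count is at least k. Unreached points keep label -1.
    """
    neighbors = knn_lists(points, k)
    counts = reverse_counts(neighbors)
    core = [c >= k for c in counts]
    labels = [-1] * len(points)
    next_label = 0
    for seed in range(len(points)):
        if labels[seed] != -1 or not core[seed]:
            continue
        labels[seed] = next_label
        stack = [seed]  # only core points are ever pushed here
        while stack:
            i = stack.pop()
            for j in neighbors[i]:
                if labels[j] == -1:
                    labels[j] = next_label
                    if core[j]:
                        stack.append(j)  # keep expanding through dense points
        next_label += 1
    return labels


# Two small groups and one isolated point (made-up data for illustration).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 30)]
labels = cluster(points, k=2)
```

In this sketch k is the only parameter: the isolated point has no reverse nearest neighbors, so it is never treated as dense and no traversal reaches it.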
Pages: 1109 - 1121
Page count: 13
Related papers
50 in total
  • [1] KR-DBSCAN: A density-based clustering algorithm based on reverse nearest neighbor and influence space
    Hu, Lihua
    Liu, Hongkai
    Zhang, Jifu
    Liu, Aiqin
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 186
  • [2] A novel density-based clustering algorithm using nearest neighbor graph
    Li, Hao
    Liu, Xiaojie
    Li, Tao
    Gan, Rundong
    PATTERN RECOGNITION, 2020, 102
  • [3] GNN-DBSCAN: A new density-based algorithm using grid and the nearest neighbor
    Li Yihong
    Wang Yunpeng
    Li Tao
    Lan Xiaolong
    Song Han
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2021, 41 (06) : 7589 - 7601
  • [4] Incremental Shared Nearest Neighbor Density-Based Clustering
    Singh, Sumeet
    Awekar, Amit
    PROCEEDINGS OF THE 22ND ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM'13), 2013 : 1533 - 1536
  • [5] Nearest neighbor - density-based clustering methods for large hyperspectral images
    Cariou, Claude
    Chehdi, Kacem
    IMAGE AND SIGNAL PROCESSING FOR REMOTE SENSING XXIII, 2017, 10427
  • [6] A dynamic density-based clustering method based on K-nearest neighbor
    Sorkhi, Mahshid Asghari
    Akbari, Ebrahim
    Rabbani, Mohsen
    Motameni, Homayun
    KNOWLEDGE AND INFORMATION SYSTEMS, 2024, 66 (05) : 3005 - 3031
  • [8] dbscan: Fast Density-Based Clustering with R
    Hahsler, Michael
    Piekenbrock, Matthew
    Doran, Derek
    JOURNAL OF STATISTICAL SOFTWARE, 2019, 91 (01) : 1 - 30
  • [9] NS-DBSCAN: A Density-Based Clustering Algorithm in Network Space
    Wang, Tianfu
    Ren, Chang
    Luo, Yun
    Tian, Jing
    ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION, 2019, 8 (05)
  • [10] U-DBSCAN: A Density-Based Clustering Algorithm for Uncertain Objects
    Tepwankul, Apinya
    Maneewongwattana, Songrit
    2010 IEEE 26TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING WORKSHOPS (ICDE 2010), 2010 : 136 - 143