Rough-DBSCAN: A fast hybrid density based clustering method for large data sets

Cited by: 103
Authors
Viswanath, P. [2]
Babu, V. Suresh [1]
Affiliations
[1] Univ Bedfordshire, Inst Res Applicable Comp, Dept Comp & Informat Syst, Luton LU1 3JU, Beds, England
[2] NRI Inst Technol, Pattern Recognit Res Lab, Dept Comp Sci & Engn, Guntur 522009, Andhra Pradesh, India
Keywords
Clustering; Density based clustering; DBSCAN; Leaders; Rough sets;
DOI
10.1016/j.patrec.2009.08.008
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Density-based clustering techniques like DBSCAN are attractive because they can find arbitrarily shaped clusters along with noisy outliers. DBSCAN's time requirement is O(n^2), where n is the size of the dataset, which makes it unsuitable for large datasets. The solution proposed in the paper is to first apply the leaders clustering method to derive prototypes, called leaders, from the dataset in a way that also preserves density information, and then to use these leaders to derive the density-based clusters. The proposed hybrid clustering technique, called rough-DBSCAN, has a time complexity of only O(n) and is analyzed using rough set theory. Experimental studies on both synthetic and real-world datasets compare rough-DBSCAN with DBSCAN. They show that, for large datasets, rough-DBSCAN finds a clustering similar to that found by DBSCAN but is consistently faster. Some properties of the leaders as prototypes are also formally established. © 2009 Elsevier B.V. All rights reserved.
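The leaders step mentioned in the abstract can be illustrated with a minimal sketch of the generic single-pass leaders method. It assumes Euclidean distance, a hypothetical distance threshold `tau`, and that each leader's follower count is the density information being kept; the abstract does not spell out these details, and the name `leaders_clustering` is illustrative rather than the authors' code.

```python
import math

def leaders_clustering(points, tau):
    """Single-pass leaders clustering (generic sketch, not the paper's code).

    Each point joins the first existing leader within distance tau;
    otherwise it becomes a new leader. Returns (leader, count) pairs,
    where count is the number of points the leader represents
    (assumed here to stand in for local density).
    """
    leaders = []  # coordinates of the leaders found so far
    counts = []   # number of points represented by each leader, itself included
    for p in points:
        for i, leader in enumerate(leaders):
            if math.dist(p, leader) <= tau:
                counts[i] += 1
                break
        else:
            # No leader is close enough: p starts a new group.
            leaders.append(p)
            counts.append(1)
    return list(zip(leaders, counts))

# Toy data: two tight groups, scanned in the given order.
data = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 0.2)]
summary = leaders_clustering(data, tau=0.5)
print(summary)  # [((0.0, 0.0), 3), ((5.0, 5.0), 2)]
```

Each point is compared only against the current leaders, so the scan is linear in n when the number of leaders stays bounded. A density-based pass would then run over the (leader, count) pairs instead of the raw points; how rough-DBSCAN does that is not described in the abstract and is not sketched here.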
Pages: 1477-1488
Page count: 12
Related papers
50 in total
  • [41] ADAPTIVE DENSITY-BASED SPATIAL CLUSTERING OF APPLICATIONS WITH NOISE (DBSCAN) ACCORDING TO DATA
    Wang, Wei-Tung
    Wu, Yi-Leh
    Tang, Cheng-Yuan
    Hor, Maw-Kae
    PROCEEDINGS OF 2015 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), VOL. 1, 2015, : 445 - 451
  • [42] An Evidence Combination Method based on DBSCAN Clustering
    Yang, Kehua
    Tan, Tian
    Zhang, Wei
    CMC-COMPUTERS MATERIALS & CONTINUA, 2018, 57 (02): : 269 - 281
  • [43] The Transmission of the Combination of Rough sets and Fuzzy Kohonen Clustering Network Technology Based on GPS Data under Large Data Environment
    Yang, Kun
    Yu, Ruifen
    PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON MANAGEMENT, EDUCATION, INFORMATION AND CONTROL, 2015, 125 : 1147 - 1150
  • [44] GB-DBSCAN: A fast granular-ball based DBSCAN clustering algorithm
    Cheng, Dongdong
    Zhang, Cheng
    Li, Ya
    Xia, Shuyin
    Wang, Guoyin
    Huang, Jinlong
    Zhang, Sulan
    Xie, Jiang
    INFORMATION SCIENCES, 2024, 674
  • [45] A New Density Based Clustering Algorithm for Binary Data Sets
    Nanda, Satyasai Jagannath
    Raman, Rahul
    Vijay, Shubham
    Bhardwaj, Anil
    2014 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND APPLICATIONS (ICHPCA), 2014,
  • [46] Density-based clustering algorithm for mixture data sets
    Huang, De-Cai
    Wu, Tian-Hong
    Kongzhi yu Juece/Control and Decision, 2010, 25 (03): : 416 - 421
  • [47] Data Labeling method based on Rough Entropy for Categorical Data Clustering
    Sreenivasulu, G.
    Raju, S. Viswanadha
    Rao, N. Sambasiva
    2014 INTERNATIONAL CONFERENCE ON ELECTRONICS, COMMUNICATION AND COMPUTATIONAL ENGINEERING (ICECCE), 2014, : 173 - 178
  • [48] A Fast Density-Grid Based Clustering Method
    Brown, Daniel
    Japa, Arialdis
    Shi, Yong
    2019 IEEE 9TH ANNUAL COMPUTING AND COMMUNICATION WORKSHOP AND CONFERENCE (CCWC), 2019, : 48 - 54
  • [49] Fast spectral clustering for large data sets using minimal enclosing ball
    Qian, Peng-Jiang
    Wang, Shi-Tong
    Deng, Zhao-Hong
    Xu, Hua
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2010, 38 (09): : 2035 - 2041
  • [50] Fast Graph-Based Relaxed Clustering for Large Data Sets Using Minimal Enclosing Ball
    Qian, Pengjiang
    Chung, Fu-Lai
    Wang, Shitong
    Deng, Zhaohong
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2012, 42 (03): : 672 - 687