GB-DBSCAN: A fast granular-ball based DBSCAN clustering algorithm

Cited by: 4
Authors
Cheng, Dongdong [1, 2, 3, 4]
Zhang, Cheng [2, 3, 4]
Li, Ya [2, 3, 4]
Xia, Shuyin [2, 3, 4]
Wang, Guoyin [2, 3, 4]
Huang, Jinlong [1]
Zhang, Sulan [1]
Xie, Jiang [2, 3, 4]
Affiliations
[1] Yangtze Normal Univ, Coll Big Data & Intelligent Engn, Chongqing 408100, Peoples R China
[2] Chongqing Univ Posts & Telecommun, Chongqing Key Lab Comp Intelligence, Chongqing 400065, Peoples R China
[3] Chongqing Univ Posts & Telecommun, Key Lab Cyberspace Big Data Intelligent Secur, Minist Educ, Chongqing 400065, Peoples R China
[4] Chongqing Univ Posts & Telecommun, Key Lab Big Data Intelligent Comp, Chongqing 400065, Peoples R China
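Funding
National Natural Science Foundation of China
Keywords
DBSCAN; Granular-ball; KNN; Clustering
DOI
10.1016/j.ins.2024.120731
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract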
Density-Based Spatial Clustering of Applications with Noise (DBSCAN) identifies high-density connected regions as clusters, which makes it well suited to discovering arbitrarily shaped clusters. However, its parameters are difficult to tune, and because it must scan every data point in turn, its time complexity is O(n^2). A granular-ball (GB) is a coarse-grained representation of data, based on the assumption that an object and its local neighbors have a similar distribution and are likely to belong to the same class. Xia et al. introduced granular-balls into supervised learning to improve its efficiency. Inspired by this idea, we introduce granular-balls into unsupervised learning and use them to improve the efficiency of DBSCAN; the resulting algorithm is called GB-DBSCAN. Its main idea is to represent sets of data points with granular-balls and then cluster the granular-balls instead of the individual data points. First, we generate granular-balls with k-nearest neighbors (KNN), a bottom-up strategy, and describe each granular-ball by its center and radius. Next, the granular-balls are divided into Core-GBs and Non-Core-GBs according to their density. The Core-GBs are then merged into clusters following the idea of DBSCAN, and the Non-Core-GBs are assigned to the appropriate clusters. Because the number of granular-balls is much smaller than the number of objects in a dataset, the running time of DBSCAN is greatly reduced. Compared with KNN-BLOCK DBSCAN, RNN-DBSCAN, DBSCAN, K-means, DP, and SNN-DPC, the proposed algorithm obtains similar or even better clustering results in much less running time.
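The four steps named in the abstract (KNN-based ball generation, core/non-core split, merging of core balls, assignment of non-core balls) can be illustrated with a rough Python sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration and not the authors' method: the greedy ball construction, the core rule (at least `min_pts` members and radius at most `eps`), the merge rule (center distance at most the sum of the radii plus `eps`), and the names `build_balls`, `gb_dbscan_sketch`, `k`, `min_pts`, and `eps`.

```python
import numpy as np


def build_balls(X, k):
    """Greedy bottom-up ball generation (assumed rule, not from the paper):
    each still-unassigned point forms a ball with its unassigned k nearest neighbors."""
    n = len(X)
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    knn = np.argsort(dists, axis=1)[:, :k + 1]           # k+1 closest points per row
    kth_dist = np.take_along_axis(dists, knn, axis=1)[:, -1]
    assigned = np.zeros(n, dtype=bool)
    balls = []
    for i in np.argsort(kth_dist):                       # tightest neighborhoods first
        if assigned[i]:
            continue
        cand = np.append(knn[i], i)                      # make sure i itself is included
        members = np.unique(cand[~assigned[cand]])
        assigned[members] = True
        center = X[members].mean(axis=0)
        radius = np.linalg.norm(X[members] - center, axis=1).max()
        balls.append((members, center, radius))
    return balls


def gb_dbscan_sketch(X, k=5, min_pts=3, eps=1.0):
    """Cluster the balls instead of the points; returns one label per point."""
    X = np.asarray(X, dtype=float)
    balls = build_balls(X, k)
    centers = np.array([c for _, c, _ in balls])
    radii = np.array([r for _, _, r in balls])
    sizes = np.array([len(m) for m, _, _ in balls])

    # Assumed density rule: enough members packed into a small enough ball.
    core = (sizes >= min_pts) & (radii <= eps)

    # Assumed adjacency rule: balls overlap or lie within eps of each other.
    gaps = np.linalg.norm(centers[:, None] - centers[None, :], axis=2)
    linked = gaps <= radii[:, None] + radii[None, :] + eps

    # DBSCAN-style expansion over core balls only.
    ball_label = np.full(len(balls), -1)
    cluster = 0
    for s in np.flatnonzero(core):
        if ball_label[s] != -1:
            continue
        ball_label[s] = cluster
        stack = [s]
        while stack:
            a = stack.pop()
            for b in np.flatnonzero(core & linked[a] & (ball_label == -1)):
                ball_label[b] = cluster
                stack.append(b)
        cluster += 1

    # Non-core balls join the cluster of the nearest core ball (assumed rule).
    core_idx = np.flatnonzero(core)
    if len(core_idx):
        for b in np.flatnonzero(~core):
            ball_label[b] = ball_label[core_idx[np.argmin(gaps[b, core_idx])]]

    labels = np.full(len(X), -1)
    for (members, _, _), lab in zip(balls, ball_label):
        labels[members] = lab
    return labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 0.3, (60, 2)), rng.normal(10.0, 0.3, (60, 2))])
    print(np.unique(gb_dbscan_sketch(X, k=5, min_pts=3, eps=2.0)))
```

The sketch still builds a full pairwise distance matrix over the points to find neighbors, so it does not reproduce the running-time behavior claimed in the abstract; it only shows where the saving would come from, since the merge step works on the much smaller set of balls.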
Pages: 17