GB-DBSCAN: A fast granular-ball based DBSCAN clustering algorithm

Cited by: 4
Authors
Cheng, Dongdong [1, 2, 3, 4]
Zhang, Cheng [2, 3, 4]
Li, Ya [2, 3, 4]
Xia, Shuyin [2, 3, 4]
Wang, Guoyin [2, 3, 4]
Huang, Jinlong [1]
Zhang, Sulan [1]
Xie, Jiang [2, 3, 4]
Affiliations
[1] Yangtze Normal Univ, Coll Big Data & Intelligent Engn, Chongqing 408100, Peoples R China
[2] Chongqing Univ Posts & Telecommun, Chongqing Key Lab Comp Intelligence, Chongqing 400065, Peoples R China
[3] Chongqing Univ Posts & Telecommun, Key Lab Cyberspace Big Data Intelligent Secur, Minist Educ, Chongqing 400065, Peoples R China
[4] Chongqing Univ Posts & Telecommun, Key Lab Big Data Intelligent Comp, Chongqing 400065, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DBSCAN; Granular-ball; KNN; Clustering
DOI
10.1016/j.ins.2024.120731
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Density-Based Spatial Clustering of Applications with Noise (DBSCAN) identifies high-density connected regions as clusters, which gives it an advantage in discovering arbitrarily shaped clusters. However, its parameters are difficult to tune, and because it must scan all data points in turn, its time complexity is O(n^2). A granular-ball (GB) is a coarse-grained representation of data. It rests on the assumption that an object and its local neighbors have a similar distribution and are highly likely to belong to the same class. Xia et al. introduced it into supervised learning to improve efficiency. Inspired by this idea, we introduce granular-balls into unsupervised learning and use them to improve the efficiency of DBSCAN; the resulting algorithm is called GB-DBSCAN. Its main idea is to represent sets of data points by granular-balls and then cluster the granular-balls instead of the individual data points. First, we use k-nearest neighbors (KNN) to generate granular-balls, which is a bottom-up strategy, and describe each granular-ball by its center and radius. Then the granular-balls are divided into Core-GBs and Non-Core-GBs according to their density. After that, the Core-GBs are merged into clusters following the idea of DBSCAN, and the Non-Core-GBs are assigned to the appropriate clusters. Since the number of granular-balls is much smaller than the number of objects in a dataset, the running time of DBSCAN is greatly reduced. In comparisons with KNN-BLOCK DBSCAN, RNN-DBSCAN, DBSCAN, K-means, DP and SNN-DPC, the proposed algorithm obtains similar or even better clustering results in much less running time.
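The pipeline described in the abstract (generate balls with KNN, split them into core and non-core by density, merge core balls, attach the rest) can be illustrated with a minimal Python sketch. The abstract does not specify the details, so everything concrete here is an assumption made for illustration and is not the authors' method: the greedy ball-generation rule, the density measure (members per unit radius), the adjacency test (center distance at most the sum of the radii), the nearest-core assignment rule, and the names build_balls, gb_dbscan_sketch, k and min_density.

```python
import math
from collections import deque


def build_balls(points, k):
    """Assumed bottom-up rule: each still-unassigned point absorbs its k
    nearest unassigned neighbours; the group is summarised by its centroid
    and the largest centroid-to-member distance (the radius)."""
    unassigned = set(range(len(points)))
    balls = []
    for i in range(len(points)):
        if i not in unassigned:
            continue
        unassigned.discard(i)
        nearest = sorted(unassigned,
                         key=lambda j: math.dist(points[i], points[j]))[:k]
        unassigned.difference_update(nearest)
        members = [i] + nearest
        center = tuple(sum(c) / len(members)
                       for c in zip(*(points[j] for j in members)))
        radius = max(math.dist(center, points[j]) for j in members)
        balls.append((center, radius, members))
    return balls


def gb_dbscan_sketch(points, k, min_density):
    """Cluster balls instead of points; returns one label per point
    (-1 means no core ball was available to attach to)."""
    balls = build_balls(points, k)
    # Assumed density measure: members per unit radius.
    is_core = [len(m) / max(r, 1e-12) >= min_density for _, r, m in balls]
    ball_label = [-1] * len(balls)
    cluster = 0
    for start in range(len(balls)):
        if not is_core[start] or ball_label[start] != -1:
            continue
        # DBSCAN-style expansion over core balls that touch or overlap.
        ball_label[start] = cluster
        queue = deque([start])
        while queue:
            a = queue.popleft()
            for b in range(len(balls)):
                touching = (math.dist(balls[a][0], balls[b][0])
                            <= balls[a][1] + balls[b][1])
                if is_core[b] and ball_label[b] == -1 and touching:
                    ball_label[b] = cluster
                    queue.append(b)
        cluster += 1
    # Assumed rule: a non-core ball joins the cluster of the nearest core ball.
    cores = [b for b in range(len(balls)) if is_core[b]]
    for b in range(len(balls)):
        if not is_core[b] and cores:
            nearest_core = min(
                cores, key=lambda c: math.dist(balls[b][0], balls[c][0]))
            ball_label[b] = ball_label[nearest_core]
    labels = [-1] * len(points)
    for (_, _, members), label in zip(balls, ball_label):
        for j in members:
            labels[j] = label
    return labels


if __name__ == "__main__":
    blob = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (0.5, 0)]
    shifted = [(x + 10, y + 10) for x, y in blob]
    print(gb_dbscan_sketch(blob + shifted, k=2, min_density=5.0))
```

The merging loop runs over balls, not points, so its pairwise cost depends on the number of balls. That is the source of the speed-up the abstract claims, although this sketch still builds the balls with a brute-force neighbor search.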
Pages: 17