A self-adaptive density-based clustering algorithm for varying densities datasets with strong disturbance factor

Cited by: 0
Authors
Cai, Zihao [1]
Gu, Zhaodong [1]
He, Kejing [1]
Affiliations
[1] South China Univ Technol, Sch Comp Sci & Engn, Guangzhou, Peoples R China
Keywords
Density-based clustering; Data partition; Information entropy; Linear disturbance factor; Self-adaptive parameter adjustment; DBSCAN; CLASSIFICATION
DOI
10.1016/j.datak.2024.102345
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering is a fundamental task in data mining, aiming to group similar objects together based on their features or attributes. With the rapid increase in data analysis volume and the growing complexity of high-dimensional data distribution, clustering has become increasingly important in numerous applications, including image analysis, text mining, and anomaly detection. DBSCAN is a powerful and widely used density-based clustering algorithm. However, DBSCAN and its variants encounter challenges when confronted with datasets exhibiting clusters of varying densities in intricate high-dimensional spaces affected by significant disturbance factors. A typical example is multi-density clustering connected by a few data points with strong internal correlations, a scenario commonly encountered in the analysis of crowd mobility. To address these challenges, we propose a Self-adaptive Density-Based Clustering Algorithm for Varying Densities Datasets with Strong Disturbance Factor (SADBSCAN). This algorithm comprises a data block splitter, a local clustering module, a global clustering module, and a data block merger to obtain adaptive clustering results. We conduct extensive experiments on both artificial and real-world datasets to evaluate the effectiveness of SADBSCAN. The experimental results indicate that SADBSCAN significantly outperforms several strong baselines across different metrics, demonstrating the high adaptability and scalability of our algorithm.
Pages: 17
Related papers
50 in total
  • [1] Density-Based Multiscale Analysis for Clustering in Strong Noise Settings With Varying Densities
    Zhang, Tian-Tian
    Yuan, Bo
    [J]. IEEE ACCESS, 2018, 6 : 25861 - 25873
  • [2] ADBSCAN: Adaptive Density-Based Spatial Clustering of Applications with Noise for Identifying Clusters with Varying Densities
    Khan, Mohammad Mahmudur Rahman
    Siddique, Md. Abu Bakr
    Arif, Rezoana Bente
    Oishe, Mahjabin Rahman
    [J]. 2018 4TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING AND INFORMATION & COMMUNICATION TECHNOLOGY (ICEEICT), 2018, : 107 - 111
  • [3] Efficient incremental density-based algorithm for clustering large datasets
    Bakr, Ahmad M.
    Ghanem, Nagia M.
    Ismail, Mohamed A.
    [J]. ALEXANDRIA ENGINEERING JOURNAL, 2015, 54 (04) : 1147 - 1154
  • [4] A density-based competitive data stream clustering network with self-adaptive distance metric
    Xu, Baile
    Shen, Furao
    Zhao, Jinxi
    [J]. NEURAL NETWORKS, 2019, 110 : 141 - 158
  • [5] An Algorithm to Adaptive Determination of Density Threshold for Density-based Clustering
    Ke, Zhang
    Lei, Huang
    Yi, Chai
    [J]. PROCEEDINGS OF THE 35TH CHINESE CONTROL CONFERENCE 2016, 2016, : 3929 - 3935
  • [6] An efficient density-based clustering algorithm for vertically partitioned distributed datasets
    Department of Computer Science and Engineering, Southeastern University, Nanjing 210096, China
    Unknown
    [J]. Jisuanji Yanjiu yu Fazhan, 2007, 9 (1612-1617):
  • [7] An efficient and scalable density-based Clustering algorithm for datasets with complex structures
    Lv, Yinghua
    Ma, Tinghuai
    Tang, Meili
    Cao, Jie
    Tian, Yuan
    Al-Dhelaan, Abdullah
    Al-Rodhaan, Mznah
    [J]. NEUROCOMPUTING, 2016, 171 : 9 - 22
  • [8] An adaptive density-based clustering algorithm for spatial database with noise
    Ma, DY
    Zhang, AD
    [J]. FOURTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2004, : 467 - 470
  • [9] Density-Based Clustering for Adaptive Density Variation
    Qian, Li
    Plant, Claudia
    Boehm, Christian
    [J]. 2021 21ST IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2021), 2021, : 1282 - 1287
  • [10] A Self-Adaptive Spectral Clustering Algorithm
    Cai Xiaoyan
    Dai Guanzhong
    Yang Libin
    Zhang Guoqing
    [J]. PROCEEDINGS OF THE 27TH CHINESE CONTROL CONFERENCE, VOL 4, 2008, : 551 - 553