A Distance and Density-based Clustering Algorithm using Automatic Peak Detection

Cited by: 8
Authors
Zhou, Rong [1]
Zhang, Shuang [2]
Chen, Chun [1]
Ning, Li [1]
Zhang, Yong [1]
Feng, Shengzhong [1]
Liu, Yi [1]
Luktarhan, Nurbol [3]
Affiliations
[1] Chinese Acad Sci, Shenzhen Inst Adv Technol, Ctr High Performance Comp, Shenzhen, Peoples R China
[2] Jining 1 Peoples Hosp, Jining, Peoples R China
[3] Xinjiang Univ, Informat Sci & Engn Coll, Urumqi, Peoples R China
Keywords
clustering algorithm; distance-based; density-based; density peak
DOI
10.1109/SmartCloud.2016.39
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Distance-based and density-based clustering algorithms are often used on large spatial data sets and on data sets of arbitrary shape. However, some well-known clustering algorithms run into trouble when the distribution of objects in the data set varies, which can lead to poor clustering results. This degradation is more pronounced on high-dimensional data sets. Recently, Rodriguez and Laio proposed an efficient clustering algorithm [1] based on two essential indicators, density and distance, which are used to find the cluster centers and play an important role in the clustering process. However, this algorithm does not work well on high-dimensional data sets, because the threshold for cluster centers is defined ambiguously and therefore has to be chosen visually and manually. In this paper, an alternative definition of the indicators is introduced, and the threshold for cluster centers is decided automatically by using an improved Canopy algorithm. With the centers fixed (each representing a cluster), each remaining data object is assigned to a cluster dependently in a single step. The performance of the algorithm is analyzed on several benchmarks. The experimental results show that (1) the clustering performance on some high-dimensional data sets, e.g., intrusion detection, is better; and (2) on low-dimensional data sets, the performance is as good as that of traditional clustering algorithms.
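As a rough illustration of the density and distance indicators mentioned in the abstract, the following is a minimal sketch of the original Rodriguez and Laio scheme with a cutoff kernel. It is not the method of this paper: the alternative indicator definitions and the improved Canopy step are not given in the abstract, so the sketch assumes a hypothetical rule that picks the `n_centers` points with the largest product of density and distance. The function name `density_peaks` and the parameters `d_c` and `n_centers` are illustrative assumptions.

```python
import math


def density_peaks(points, d_c, n_centers):
    """Sketch of density-peak clustering (Rodriguez and Laio, cutoff kernel).

    Assumption: centers are the n_centers points with the largest
    rho * delta. This stands in for the automatic threshold described
    in the abstract, which is not reproduced here.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # rho[i]: number of other points closer to i than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # Visit points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    # delta[i]: distance to the nearest point that precedes i in `order`.
    # parent[i]: that point, used later to propagate cluster labels.
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # Densest point: by convention, its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Hypothetical center selection: largest gamma = rho * delta.
    # The sort is stable, so the densest point always wins ties.
    centers = sorted(order, key=lambda i: -rho[i] * delta[i])[:n_centers]

    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id

    # Single pass: each remaining point takes the label of its parent,
    # which has already been labeled because it comes earlier in `order`.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return centers, labels


points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
centers, labels = density_peaks(points, d_c=1.0, n_centers=2)
```

In this toy example the two clump midpoints (indices 4 and 9) have the highest density and end up as centers, and every other point inherits the label of its nearest denser neighbor in one pass.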
Pages: 176-183
Page count: 8