Fast Density Clustering Algorithm for Numerical Data and Categorical Data

Cited by: 9
Authors
Chen Jinyin [1]
He Huihao [1]
Chen Jungan [2]
Yu Shanqing [1]
Shi Zhaoxia [1]
Affiliations
[1] Zhejiang Univ Technol, Hangzhou 310023, Zhejiang, Peoples R China
[2] Ningbo Wanli Univ, Dept Elect Engn, Ningbo 310023, Zhejiang, Peoples R China
Funding
Natural Science Foundation of Zhejiang Province; National Natural Science Foundation of China
Keywords
MIXED DATA;
DOI
10.1155/2017/6393652
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Real-world data objects often have mixed numerical and categorical attributes. Most existing clustering algorithms for such data suffer from low clustering quality, difficulty in determining cluster centers, and sensitivity to initial parameters. This paper proposes a fast density clustering algorithm (FDCA) based on a one-time scan, in which cluster centers are determined automatically by a center set algorithm (CSA). A novel similarity metric is designed for data with both numerical and categorical attributes. CSA selects cluster centers from the data objects automatically, which overcomes the difficulty of setting cluster centers found in most clustering algorithms. The proposed method is evaluated through a series of experiments on ten mixed data sets and compared with several other clustering algorithms in terms of clustering purity, efficiency, and time complexity.
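The abstract does not spell out the similarity metric or the CSA procedure, so the following is only a rough sketch of the general idea of density clustering on mixed data, not the authors' method. It assumes a Gower-style distance (range-normalized difference for numerical attributes, simple mismatch for categorical ones) and the local density and separation quantities used by density-peak clustering. The names `mixed_distance` and `density_and_delta` and the `cutoff` parameter are hypothetical.

```python
def mixed_distance(a, b, num_idx, cat_idx, ranges):
    """Gower-style distance for mixed records (an assumed stand-in metric).

    a, b     : records indexable by attribute position
    num_idx  : positions of numerical attributes
    cat_idx  : positions of categorical attributes
    ranges   : dict mapping a numerical position to its value range (max - min)
    """
    total = 0.0
    for i in num_idx:
        span = ranges[i]
        # Range-normalized absolute difference, in [0, 1] for in-range values.
        total += abs(a[i] - b[i]) / span if span > 0 else 0.0
    for i in cat_idx:
        # Simple matching: 0 if the categories agree, 1 otherwise.
        total += 0.0 if a[i] == b[i] else 1.0
    return total / (len(num_idx) + len(cat_idx))


def density_and_delta(points, dist, cutoff):
    """Compute density-peak style quantities for every point.

    rho[i]   : number of other points closer than `cutoff`
    delta[i] : distance to the nearest point with strictly higher rho,
               or the largest distance from i if no such point exists
    """
    n = len(points)
    d = [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    rho = [sum(1 for j in range(n) if j != i and d[i][j] < cutoff)
           for i in range(n)]
    delta = []
    for i in range(n):
        higher = [d[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(d[i]))
    return rho, delta


if __name__ == "__main__":
    # Toy records: one numerical attribute and one categorical attribute.
    data = [(1.0, "a"), (1.2, "a"), (0.9, "a"), (5.0, "b"), (5.1, "b")]
    dist = lambda p, q: mixed_distance(p, q, [0], [1], {0: 4.2})
    rho, delta = density_and_delta(data, dist, cutoff=0.3)
    for record, r, dl in zip(data, rho, delta):
        print(record, r, round(dl, 3))
```

Points with both high `rho` and high `delta` are the natural candidates for cluster centers in this family of methods. Choosing them automatically is the step the abstract attributes to CSA, and this sketch does not attempt it.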
Pages: 15
Related papers
50 in total
  • [1] A New Weight Based Density Peaks Clustering Algorithm for Numerical and Categorical Data
    Tong, Wuning
    Wang, Yuping
    Zhong, Junkun
    Yan, Wei
    2017 13TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND SECURITY (CIS), 2017, : 169 - 172
  • [2] Density-based clustering algorithm for numerical and categorical data with mixed distance measure methods
    Chen, Jin-Yin
    He, Hui-Hao
    Kongzhi Lilun Yu Yingyong/Control Theory and Applications, 2015, 32 (08): : 993 - 1002
  • [3] Incremental clustering algorithm of mixed numerical and categorical data based on clustering ensemble
    Li, Tao-Ying
    Chen, Yan
    Zhang, Jin-Song
    Qin, Sheng-Jun
    Kongzhi yu Juece/Control and Decision, 2012, 27 (04): : 603 - 608
  • [4] HABOS clustering algorithm for categorical data
    Wu, Sen (wusen@manage.ustb.edu.cn), 2016, Science Press (38):
  • [5] Clustering algorithm for Boolean and categorical data
    Liu, H.
    Deng, H.
    Lu, S.
    Huazhong Ligong Daxue Xuebao/Journal Huazhong (Central China) University of Science and Technology, 2001, 29 (03): : 30 - 32
  • [6] Clustering categorical data using coverage density
    Yan, H
    Zhang, L
    Zhang, Y
    ADVANCED DATA MINING AND APPLICATIONS, PROCEEDINGS, 2005, 3584 : 248 - 255
  • [7] Kernel Subspace Clustering Algorithm for Categorical Data
    Xu K.-P.
    Chen L.-F.
    Sun H.-J.
    Wang B.-Z.
    Ruan Jian Xue Bao/Journal of Software, 2020, 31 (11): : 3492 - 3505
  • [8] Squeezer: An efficient algorithm for clustering categorical data
    He, ZY
    Xu, XF
    Deng, SC
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2002, 17 (05) : 611 - 624
  • [9] A hierarchical clustering algorithm for categorical sequence data
    Oh, SJ
    Kim, JY
    INFORMATION PROCESSING LETTERS, 2004, 91 (03) : 135 - 140
  • [10] Coercion: A Distributed Clustering Algorithm for Categorical Data
    Wang, Bin
    Zhou, Yang
    Hei, Xinhong
    2013 9TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND SECURITY (CIS), 2013, : 683 - 687