Fast Density Clustering Algorithm for Numerical Data and Categorical Data

Cited by: 9
Authors
Chen Jinyin [1]
He Huihao [1]
Chen Jungan [2]
Yu Shanqing [1]
Shi Zhaoxia [1]
Affiliations
[1] Zhejiang Univ Technol, Hangzhou 310023, Zhejiang, Peoples R China
[2] Ningbo Wanli Univ, Dept Elect Engn, Ningbo 310023, Zhejiang, Peoples R China
Funding
Natural Science Foundation of Zhejiang Province; National Natural Science Foundation of China
Keywords
MIXED DATA;
DOI
10.1155/2017/6393652
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Data objects with mixed numerical and categorical attributes are common in real-world applications. Most existing algorithms for such data have limitations such as low clustering quality, difficulty in determining cluster centers, and sensitivity to initial parameters. A fast density clustering algorithm (FDCA) is proposed that needs only a one-time scan, with cluster centers determined automatically by a center set algorithm (CSA). A novel similarity metric is designed for clustering data that include both numerical and categorical attributes. CSA chooses cluster centers from the data objects automatically, which overcomes the difficulty of setting cluster centers found in most clustering algorithms. The performance of the proposed method is verified through a series of experiments on ten mixed data sets, in comparison with several other clustering algorithms, in terms of clustering purity, efficiency, and time complexity.
Pages: 15
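The abstract mentions a similarity metric for objects with both numerical and categorical attributes, and a density-based choice of cluster centers, but this record does not give either definition. As a rough illustration of the general idea only, the sketch below uses a generic Gower-style distance (range-normalized absolute difference for numerical attributes, simple mismatch for categorical ones) and a plain neighbor count as local density. The names `mixed_distance`, `local_density`, `numeric_ranges`, and `cutoff` are hypothetical, and none of this should be read as the FDCA or CSA procedure from the paper.

```python
from typing import Dict, Sequence, Union

Value = Union[float, str]


def mixed_distance(a: Sequence[Value], b: Sequence[Value],
                   numeric_ranges: Dict[int, float]) -> float:
    """Gower-style distance in [0, 1] (an assumption, not the paper's metric).

    numeric_ranges maps the index of each numerical attribute to its
    range (max - min) over the data set; every other index is treated
    as categorical.
    """
    total = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if i in numeric_ranges:
            span = numeric_ranges[i]
            # Range-normalized absolute difference for numerical attributes.
            total += abs(x - y) / span if span > 0 else 0.0
        else:
            # Simple mismatch for categorical attributes.
            total += 0.0 if x == y else 1.0
    return total / len(a)


def local_density(data: Sequence[Sequence[Value]], idx: int, cutoff: float,
                  numeric_ranges: Dict[int, float]) -> int:
    """Count the other objects closer than `cutoff` to data[idx]."""
    return sum(
        1
        for j, other in enumerate(data)
        if j != idx and mixed_distance(data[idx], other, numeric_ranges) < cutoff
    )


# Toy data: one numerical attribute (index 0) and one categorical attribute.
data = [(1.0, "red"), (1.5, "red"), (9.0, "blue"), (11.0, "blue")]
numeric_ranges = {0: 10.0}  # max - min of attribute 0 over the toy data

densities = [local_density(data, i, 0.2, numeric_ranges) for i in range(len(data))]
```

In a density-based scheme, objects with high local density that lie far from any denser object are natural candidates for cluster centers; how the paper's CSA selects them is not stated in this record.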