Fast Density Clustering Algorithm for Numerical Data and Categorical Data

Cited by: 9
Authors
Chen Jinyin [1 ]
He Huihao [1 ]
Chen Jungan [2 ]
Yu Shanqing [1 ]
Shi Zhaoxia [1 ]
Affiliations
[1] Zhejiang Univ Technol, Hangzhou 310023, Zhejiang, Peoples R China
[2] Ningbo Wanli Univ, Dept Elect Engn, Ningbo 310023, Zhejiang, Peoples R China
Funding
Natural Science Foundation of Zhejiang Province; National Natural Science Foundation of China;
Keywords
MIXED DATA;
DOI
10.1155/2017/6393652
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Data objects with mixed numerical and categorical attributes are common in real-world applications. Most existing algorithms suffer from limitations such as low clustering quality, difficulty in determining cluster centers, and sensitivity to initial parameters. A fast density clustering algorithm (FDCA) is proposed that is based on a one-time scan, with cluster centers determined automatically by a center set algorithm (CSA). A novel data similarity metric is designed for clustering data that include both numerical and categorical attributes. CSA chooses cluster centers from the data objects automatically, which overcomes the difficulty of setting cluster centers found in most clustering algorithms. The performance of the proposed method is verified through a series of experiments on ten mixed data sets, in comparison with several other clustering algorithms, in terms of clustering purity, efficiency, and time complexity.
Pages: 15
Related papers
50 in total
  • [31] Efficient layered density-based clustering of categorical data
    Andreopoulos, Bill
    An, Aijun
    Wang, Xiaogang
    Labudde, Dirk
    JOURNAL OF BIOMEDICAL INFORMATICS, 2009, 42 (02) : 365 - 376
  • [32] Hierarchical density-based clustering of categorical data and a simplification
    Andreopoulos, Bill
    An, Aijun
    Wang, Xiaogang
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2007, 4426 : 11 - +
  • [33] A fuzzy k-modes algorithm for clustering categorical data
    Huang, ZX
    Ng, MK
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 1999, 7 (04) : 446 - 452
  • [34] QROCK: A quick version of the ROCK algorithm for clustering of categorical data
    Dutta, M
    Mahanta, AK
    Pujari, AK
    PATTERN RECOGNITION LETTERS, 2005, 26 (15) : 2364 - 2373
  • [35] K-distributions: A new algorithm for clustering categorical data
    Cai, Zhihua
    Wang, Dianhong
    Jiang, Liangxiao
    ADVANCED INTELLIGENT COMPUTING THEORIES AND APPLICATIONS, PROCEEDINGS: WITH ASPECTS OF ARTIFICIAL INTELLIGENCE, 2007, 4682 : 436 - 443
  • [36] A Genetic Algorithm Based Ensemble Approach for Categorical Data Clustering
    Goswami, Jyoti Prokash
    Mahanta, Anjana Kakoti
    2015 ANNUAL IEEE INDIA CONFERENCE (INDICON), 2015,
  • [37] A Categorical Data Clustering Algorithm and Its Efficient Parallel Implementation
    Ding, Xiangwu
    Tan, Jia
    Wang, Mei
    PROCEEDINGS OF 2016 5TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT), 2016, : 224 - 228
  • [38] A modified K-means algorithm for categorical data clustering
    Sun, Y
    Zhu, QM
    Chen, ZX
    IC-AI'2000: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 1-III, 2000, : 31 - 37
  • [39] A genetic k-modes algorithm for clustering categorical data
    Gan, GJ
    Yang, ZJ
    Wu, JH
    ADVANCED DATA MINING AND APPLICATIONS, PROCEEDINGS, 2005, 3584 : 195 - 202
  • [40] Algorithm for fuzzy clustering of mixed data with numeric and categorical attributes
    Ahmad, A
    Dey, L
    DISTRIBUTED COMPUTING AND INTERNET TECHNOLOGY, PROCEEDINGS, 2005, 3816 : 561 - 572