A subspace hierarchical clustering algorithm for categorical data

Cited by: 1
Authors
Carbonera, Joel Luis [1]
Abel, Mara [1]
Affiliations
[1] Univ Fed Rio Grande do Sul, Porto Alegre, RS, Brazil
Keywords
K-MEANS;
DOI
10.1109/ICTAI.2019.00077
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a soft subspace hierarchical clustering algorithm for categorical data. The proposed algorithm extends the traditional agglomerative hierarchical clustering approach to identify clusters of categorical data in subspaces. The algorithm adopts a correlation-based approach for measuring the relevance of each categorical attribute during the clustering process. We performed experiments on six well-known datasets, comparing the performance of our algorithm with the original agglomerative algorithm for hierarchical clustering and with five other partitional subspace clustering algorithms, using two well-known evaluation metrics: accuracy and F-measure. According to the experiments, the proposed algorithm outperforms the original one. In addition, the proposed algorithm outperforms most of the partitional algorithms while providing additional advantages.
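The general idea in the abstract, agglomerative clustering of categorical records in which attributes contribute unequally to the distance, can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the correlation-based relevance measure, so the attribute weights below are supplied by hand, and the function names (`weighted_mismatch`, `agglomerative`), the average-linkage criterion, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations


def weighted_mismatch(a, b, weights):
    """Weighted simple-matching dissimilarity between two categorical records.

    Each attribute on which the records differ adds its weight to the distance.
    The weights are an assumed input here; the paper derives attribute
    relevance with a correlation-based measure not described in the abstract.
    """
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def agglomerative(records, weights, n_clusters):
    """Average-linkage agglomerative clustering over categorical records.

    Returns a list of clusters, each a list of record indices.
    """
    clusters = [[i] for i in range(len(records))]
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest average pairwise distance.
        for p, q in combinations(range(len(clusters)), 2):
            total = sum(
                weighted_mismatch(records[i], records[j], weights)
                for i in clusters[p]
                for j in clusters[q]
            )
            dist = total / (len(clusters[p]) * len(clusters[q]))
            if best is None or dist < best[0]:
                best = (dist, p, q)
        _, p, q = best
        # Merge cluster q into cluster p (p < q, so deleting q keeps p valid).
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


# Toy data (hypothetical): colour and shape are treated as relevant, size is not.
records = [
    ("red", "round", "small"),
    ("red", "round", "large"),
    ("blue", "square", "small"),
    ("blue", "square", "large"),
]
weights = [0.45, 0.45, 0.10]
clusters = agglomerative(records, weights, n_clusters=2)
print(clusters)
```

In a soft subspace method the weights would typically be estimated per cluster and updated as clusters merge; the sketch keeps a single global weight vector to stay short.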
Pages: 509-516
Number of pages: 8