A k-populations algorithm for clustering categorical data

Cited by: 22
Authors
Kim, DW [1]
Lee, K
Lee, D
Lee, KH
Affiliations
[1] Korea Adv Inst Sci & Technol, Dept BioSyst, Taejon 305701, South Korea
[2] Korea Adv Inst Sci & Technol, Adv Informat Technol Res Ctr, Taejon 305701, South Korea
[3] Korea Adv Inst Sci & Technol, Dept Elect Engn & Comp Sci, Taejon 305701, South Korea
Keywords
clustering; categorical data; hierarchical algorithm; k-modes algorithm; fuzzy k-modes algorithm
DOI
10.1016/j.patcog.2004.11.017
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, the conventional k-modes-type algorithms for clustering categorical data are extended by representing the clusters of categorical data with k-populations instead of the hard-type centroids used in the conventional algorithms. Use of a population-based centroid representation makes it possible to preserve the uncertainty inherent in data sets as long as possible before actual decisions are made. The k-populations algorithm was found to give markedly better clustering results through various experiments. (c) 2005 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
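The contrast the abstract draws between a hard centroid and a population-based one can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy data, and the dissimilarity measure (one minus the relative frequency of the object's category, summed over attributes) are assumptions chosen only to show how a per-attribute frequency distribution retains information that a single mode discards.

```python
from collections import Counter

def mode_centroid(rows):
    """Hard centroid: the most frequent category of each attribute (k-modes style)."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*rows)]

def population_centroid(rows):
    """Soft centroid: relative frequency of every category, per attribute."""
    n = len(rows)
    return [{v: c / n for v, c in Counter(col).items()} for col in zip(*rows)]

def dissimilarity(row, population):
    """Assumed measure (not taken from the paper): for each attribute,
    1 minus the frequency of the row's category in the cluster, summed."""
    return sum(1.0 - dist.get(v, 0.0) for v, dist in zip(row, population))

# Hypothetical toy cluster with two categorical attributes.
cluster = [("red", "small"), ("red", "large"), ("blue", "small"), ("red", "small")]

modes = mode_centroid(cluster)             # keeps one category per attribute
population = population_centroid(cluster)  # keeps the full distribution

d_typical = dissimilarity(("red", "small"), population)
d_atypical = dissimilarity(("blue", "large"), population)
```

In this sketch the mode centroid records only `red` and `small`, whereas the population centroid also remembers that a quarter of the cluster is `blue` and a quarter is `large`, so an object with minority categories is still treated as partially matching the cluster.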
Pages: 1131-1134
Page count: 4