Clustering categorical data: Soft rounding k-modes

Cited by: 0
Authors
Gavva, Surya Teja [1]
Karthik, C. S. [1]
Punna, Sharath [1]
Affiliations
[1] Rutgers State Univ, Piscataway, NJ 08854 USA
Keywords
ALGORITHM;
DOI
10.1016/j.ic.2023.105115
Chinese Library Classification
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
Over the last three decades, researchers have intensively explored various clustering tools for categorical data analysis. Despite the proposal of various clustering algorithms, the classical k-modes algorithm remains a popular choice for unsupervised learning of categorical data. Surprisingly, our first insight is that in a natural generative block model, the k-modes algorithm performs poorly for a large range of parameters. We remedy this issue by proposing a soft rounding variant of the k-modes algorithm (SoftModes) and theoretically prove that our variant addresses the drawbacks of the k-modes algorithm in the generative model. Finally, we empirically verify that SoftModes performs well on both synthetic and real-world datasets. © 2023 Elsevier Inc. All rights reserved.
Pages: 11
Related papers
50 in total
  • [1] Initialization of K-Modes Clustering for Categorical Data
    Li Tao-ying
    Chen Yan
    Jin Zhi-hong
    Li Ye
    [J]. 2013 INTERNATIONAL CONFERENCE ON MANAGEMENT SCIENCE AND ENGINEERING (ICMSE), 2013: 107 - 112
  • [2] A fuzzy k-modes algorithm for clustering categorical data
    Huang, ZX
    Ng, MK
    [J]. IEEE TRANSACTIONS ON FUZZY SYSTEMS, 1999, 7 (04) : 446 - 452
  • [3] A Global K-modes Algorithm for Clustering Categorical Data
    Bai Tian
    Kulikowski, C. A.
    Gong Leiguang
    Yang Bin
    Huang Lan
    Zhou Chunguang
    [J]. CHINESE JOURNAL OF ELECTRONICS, 2012, 21 (03) : 460 - 465
  • [4] A genetic k-modes algorithm for clustering categorical data
    Gan, GJ
    Yang, ZJ
    Wu, JH
    [J]. ADVANCED DATA MINING AND APPLICATIONS, PROCEEDINGS, 2005, 3584 : 195 - 202
  • [5] Clustering of Categorical Data Using Intuitionistic Fuzzy k-modes
    Mehta, Darshan
    Tripathy, B. K.
    [J]. PROCEEDINGS OF SIXTH INTERNATIONAL CONFERENCE ON SOFT COMPUTING FOR PROBLEM SOLVING (SOCPROS 2016), VOL 1, 2017, 546 : 254 - 263
  • [6] A weighting k-modes algorithm for subspace clustering of categorical data
    Cao, Fuyuan
    Liang, Jiye
    Li, Deyu
    Zhao, Xingwang
    [J]. NEUROCOMPUTING, 2013, 108 : 23 - 30
  • [7] A genetic fuzzy k-Modes algorithm for clustering categorical data
    Gan, G.
    Wu, J.
    Yang, Z.
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2009, 36 (02) : 1615 - 1620
  • [8] An efficient k-modes algorithm for clustering categorical datasets
    Dorman, Karin S.
    Maitra, Ranjan
    [J]. STATISTICAL ANALYSIS AND DATA MINING, 2022, 15 (01) : 83 - 97
  • [9] Optimal mathematical programming and variable neighborhood search for k-modes categorical data clustering
    Xiao, Yiyong
    Huang, Changhao
    Huang, Jiaoying
    Kaku, Ikou
    Xu, Yuchun
    [J]. PATTERN RECOGNITION, 2019, 90 : 183 - 195
  • [10] The k-modes type clustering plus between-cluster information for categorical data
    Bai, Liang
    Liang, Jiye
    [J]. NEUROCOMPUTING, 2014, 133 : 111 - 121