Initialization of K-Modes Clustering for Categorical Data

Authors
Li Tao-ying [1]
Chen Yan [1]
Jin Zhi-hong [1]
Li Ye [1]
Affiliations
[1] Dalian Maritime Univ, Transportat Management Coll, Dalian 116026, Peoples R China
Keywords
categorical data; density and grid measure; initialization of clustering; the k-modes clustering; MEANS ALGORITHM;
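DOI
Not available
Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;
Abstract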
The k-modes clustering algorithm is one of the most widely used partitional algorithms for categorical data. Unfortunately, due to its gradient descent nature, the algorithm is highly sensitive to the initialization of clustering. Initialization methods for categorical data have been proposed to address this problem. In this paper, we present an overview of clustering initialization methods for numerical data and for categorical data, with an emphasis on their computational efficiency. We then propose a new initialization method for categorical data, which obtains good initial cluster centers using a new distance based on the RD, and explore density and grid methods. Finally, the proposed method is tested on the diagnosis dataset, a real-world data set from the UCI Machine Learning Repository, and the experimental results are analyzed; they illustrate that the proposed method is effective and efficient for initializing the clustering of categorical data.
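For readers unfamiliar with k-modes, the sketch below illustrates the generic algorithm the abstract refers to, not the initialization method proposed in the paper. It assumes the standard simple matching dissimilarity (the number of attributes on which two records differ) and takes the initial modes as an explicit argument, since that is the choice the abstract describes the algorithm as sensitive to. The function names and the toy records are hypothetical.

```python
from collections import Counter


def matching_dissimilarity(a, b):
    """Count the attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(records, initial_modes, max_iter=100):
    """Plain k-modes: alternate assignment and mode update until labels stop changing.

    Assumes max_iter >= 1. Returns (modes, labels, total within-cluster cost).
    """
    modes = [tuple(m) for m in initial_modes]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each record joins the nearest mode (ties go to the lower index).
        new_labels = [
            min(range(len(modes)), key=lambda j: matching_dissimilarity(r, modes[j]))
            for r in records
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each mode takes the most frequent value of every attribute.
        for j in range(len(modes)):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:  # an empty cluster keeps its previous mode
                modes[j] = tuple(
                    Counter(column).most_common(1)[0][0] for column in zip(*members)
                )
    cost = sum(matching_dissimilarity(r, modes[lab]) for r, lab in zip(records, labels))
    return modes, labels, cost


# Toy categorical records (hypothetical data, not the dataset used in the paper).
records = [
    ("a", "x", "p"),
    ("a", "x", "q"),
    ("a", "y", "p"),
    ("b", "z", "r"),
    ("b", "z", "s"),
    ("b", "w", "r"),
]

# Initial modes chosen by hand here; an initialization method would supply these.
modes, labels, cost = k_modes(records, initial_modes=[records[0], records[3]])
```

Because the procedure only ever moves to an assignment that does not increase the cost, it can stop at a local optimum, which is why the choice of `initial_modes` matters.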
Pages: 107-112
Number of pages: 6