Kernel-Based k-Representatives Algorithm for Fuzzy Clustering of Categorical Data

Cited by: 1
Authors
Mau, Toan Nguyen [1]
Huynh, Van-Nam [1]
Affiliations
[1] Japan Advanced Institute of Science and Technology, School of Advanced Science and Technology, Nomi, Ishikawa, Japan
Keywords
Fuzzy clustering; Fuzzy silhouette; Categorical data; k-representatives; MODES ALGORITHM
DOI
10.1109/FUZZ45933.2021.9494597
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Fuzzy cluster analysis addresses unclear boundaries between clusters by grouping objects into fuzzy clusters based on their similarities. In this paper, we propose a new method for fuzzy clustering of data with categorical attributes. Specifically, we first introduce a kernel-based representation of cluster centers in which the underlying distribution of categorical values within a cluster center is estimated as a weighted sum of the uniform distribution and the values' frequency distribution. We then extend the k-centers clustering method by applying this newly proposed cluster center representation to fuzzy clustering of categorical data. The effectiveness and efficiency of the proposed method are demonstrated by experiments on 16 real-world datasets and by comparison with existing methods. In addition, our research can be regarded as the first attempt to apply a fuzzy silhouette scoring method, which accounts for both the internal coherence and the external separation of fuzzy clusters, to the clustering of categorical data.
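
The cluster center representation described in the abstract can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the authors' implementation: the function name `smoothed_center` and the mixing weight `lam` are hypothetical, and the abstract does not say how the weight (or kernel bandwidth) is chosen. The sketch only shows the stated idea that the estimated distribution for one categorical attribute is a weighted sum of the uniform distribution and the observed frequency distribution.

```python
from collections import Counter


def smoothed_center(values, domain, lam=0.2):
    """Estimate P(v) for one categorical attribute within one cluster.

    `lam` is a hypothetical mixing weight in [0, 1] (not taken from the
    paper): lam=0 returns the raw relative frequencies, lam=1 returns
    the uniform distribution over `domain`.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if not domain:
        raise ValueError("domain must contain at least one category")

    uniform = 1.0 / len(domain)
    n = len(values)
    if n == 0:
        # An empty cluster carries no frequency information,
        # so fall back to the uniform distribution.
        return {v: uniform for v in domain}

    counts = Counter(values)
    # Weighted sum of the uniform and the frequency distribution.
    return {v: lam * uniform + (1.0 - lam) * counts[v] / n for v in domain}


# Usage: three objects in a cluster, one attribute with three categories.
p = smoothed_center(["red", "red", "blue"], ["red", "blue", "green"], lam=0.3)
print(p)
```

In this sketch a category that never occurs in the cluster ("green") still receives a nonzero probability of `lam / len(domain)`, which is the practical effect of mixing in the uniform distribution.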
Pages: 6