Automated Attribute Weighting Fuzzy k-Centers Algorithm for Categorical Data Clustering

Cited by: 4
Authors
Mau, Toan Nguyen [1]
Huynh, Van-Nam [1]
Affiliations
[1] Japan Adv Inst Sci & Technol, Sch Adv Sci & Technol, Nomi, Ishikawa, Japan
Keywords
Fuzzy clustering; Categorical data; k-representatives; k-centers; Modes algorithm
DOI
10.1007/978-3-030-85529-1_17
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Cluster analysis plays an important role in exploring correlations in data by dividing a dataset into separate clusters so that similar objects fall in the same cluster. Moreover, fuzzy cluster analysis can reveal mixtures of clusters in datasets that contain multiple distributions. The outcome of a clustering method is largely determined by how similarity is defined, so the similarity measure is critical to the formation of fuzzy clusters. The similarity between two objects is usually computed as the mean of their differences across multiple dimensions. However, the dissimilarity in some dimensions has little or no effect on the fuzzy clustering outcome. In this study, we explore such effects for fuzzy clustering of data with categorical attributes. Accordingly, the impact of each attribute on each fuzzy cluster is calculated using an optimizer, and the overlapping dissimilar values are then adjusted by the corresponding weights. We propose applying this approach to the Fk-centers clustering algorithm, and the experimental results show that the proposed method can achieve higher fuzzy silhouette scores than other related works. These results demonstrate the applicability of the proposed method to real-world applications.
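The weighting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the formulas, so the sketch assumes a cluster center stored as one frequency distribution per attribute (in the style of k-representatives), a dissimilarity of the form weight × (1 − frequency of the object's value), and the standard fuzzy c-means membership update with fuzzifier `m`. The names `weighted_dissimilarity`, `fuzzy_memberships`, and the toy data are hypothetical, and the per-cluster weights are fixed by hand rather than learned by an optimizer.

```python
from typing import Dict, List, Sequence

# Assumption: a cluster center is one distribution per attribute, mapping
# each category value to its relative frequency within the cluster.
Center = List[Dict[str, float]]


def weighted_dissimilarity(obj: Sequence[str], center: Center,
                           weights: Sequence[float]) -> float:
    """Sum over attributes of weight * (1 - frequency of the object's value)."""
    return sum(w * (1.0 - dist.get(value, 0.0))
               for value, dist, w in zip(obj, center, weights))


def fuzzy_memberships(obj: Sequence[str], centers: Sequence[Center],
                      cluster_weights: Sequence[Sequence[float]],
                      m: float = 2.0) -> List[float]:
    """Standard fuzzy c-means style membership update (requires m > 1)."""
    dists = [weighted_dissimilarity(obj, c, w)
             for c, w in zip(centers, cluster_weights)]
    # An object at zero dissimilarity from a center belongs fully to it
    # (membership is split evenly if several centers tie at zero).
    zeros = [d == 0.0 for d in dists]
    if any(zeros):
        n_zero = sum(zeros)
        return [1.0 / n_zero if z else 0.0 for z in zeros]
    exponent = 1.0 / (m - 1.0)
    inverse = [(1.0 / d) ** exponent for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]


# Toy data (hypothetical): two attributes, colour and size.
centers = [
    [{"red": 0.8, "blue": 0.2}, {"small": 0.5, "large": 0.5}],
    [{"red": 0.1, "blue": 0.9}, {"small": 0.5, "large": 0.5}],
]
# Hand-picked weights: colour separates the clusters, size does not,
# so size is down-weighted in both clusters.
cluster_weights = [[0.9, 0.1], [0.9, 0.1]]
obj = ("red", "small")

memberships = fuzzy_memberships(obj, centers, cluster_weights)
print(memberships)
```

With these weights the uninformative size attribute contributes little to either dissimilarity, so the object's membership is driven mainly by colour. In the paper the weights are obtained by an optimizer per attribute and per cluster; here they are only illustrative constants.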
Pages: 205-217
Number of pages: 13