A multi-objective genetic algorithm with fuzzy c-means for automatic data clustering

Cited by: 69
Author
Wikaisuksakul, Siripen [1]
Affiliation
[1] Prince Songkla Univ, Fac Sci & Technol, Dept Math & Comp Sci, Muang 94000, Pattani, Thailand
Keywords
Clustering; Multiobjective optimization; Fuzzy clustering; Genetic algorithms; INDEX;
DOI
10.1016/j.asoc.2014.08.036
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This article presents a multi-objective genetic algorithm which considers the problem of data clustering. A given dataset is automatically assigned into a number of groups in appropriate fuzzy partitions through the fuzzy c-means method. This work has tried to exploit the advantage of fuzzy properties which provide capability to handle overlapping clusters. However, most fuzzy methods are based on compactness and/or separation measures which use only centroid information. The calculation from centroid information only may not be sufficient to differentiate the geometric structures of clusters. The overlap-separation measure using an aggregation operation of fuzzy membership degrees is better equipped to handle this drawback. For another key consideration, we need a mechanism to identify appropriate fuzzy clusters without prior knowledge on the number of clusters. From this requirement, an optimization with single criterion may not be feasible for different cluster shapes. A multi-objective genetic algorithm is therefore appropriate to search for fuzzy partitions in this situation. Apart from the overlap-separation measure, the well-known fuzzy J(m) index is also optimized through genetic operations. The algorithm simultaneously optimizes the two criteria to search for optimal clustering solutions. A string of real-coded values is encoded to represent cluster centers. A number of strings with different lengths varied over a range correspond to variable numbers of clusters. These real-coded values are optimized and the Pareto solutions corresponding to a tradeoff between the two objectives are finally produced. As shown in the experiments, the approach provides promising solutions in well-separated, hyperspherical and overlapping clusters from synthetic and real-life data sets. This is demonstrated by the comparison with existing single-objective and multi-objective clustering techniques. (C) 2014 Elsevier B.V. All rights reserved.
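The building blocks named in the abstract can be illustrated with a minimal sketch. It assumes the standard fuzzy c-means membership update and the standard J_m objective; the function names (`decode_centers`, `fcm_memberships`, `jm_index`, `pareto_front`) are hypothetical and are not taken from the paper. The overlap-separation measure is not defined in the abstract, so it is not implemented here, and the Pareto filter simply assumes that every objective is minimized.

```python
import math

def decode_centers(chromosome, dim):
    """Split a flat real-coded string into cluster centers of length `dim`.
    Strings of different lengths therefore encode different numbers of clusters."""
    if len(chromosome) % dim != 0:
        raise ValueError("chromosome length must be a multiple of dim")
    return [chromosome[i:i + dim] for i in range(0, len(chromosome), dim)]

def fcm_memberships(points, centers, m=2.0):
    """Standard fuzzy c-means membership degrees for fixed centers (fuzzifier m > 1)."""
    power = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [math.dist(x, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point coincides with one or more centers: split full membership among them.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        inv = [d ** -power for d in dists]
        s = sum(inv)
        memberships.append([v / s for v in inv])
    return memberships

def jm_index(points, centers, memberships, m=2.0):
    """Fuzzy J_m: sum over points and clusters of u^m times squared distance."""
    return sum((u ** m) * math.dist(x, c) ** 2
               for x, row in zip(points, memberships)
               for u, c in zip(row, centers))

def pareto_front(scores):
    """Indices of non-dominated score tuples, assuming all objectives are minimized."""
    def dominates(a, b):
        return (all(p <= q for p, q in zip(a, b))
                and any(p < q for p, q in zip(a, b)))
    return [i for i, s in enumerate(scores)
            if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)]
```

In a genetic search of this kind, each candidate string would be decoded, scored on both objectives, and the non-dominated candidates kept as the tradeoff set. How the paper itself performs selection, crossover and mutation is not stated in the abstract.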
Pages: 679-691
Number of pages: 13
Related papers
50 in total
  • [31] A fuzzy clustering model of data and fuzzy c-means
    Nascimento, S
    Mirkin, B
    Moura-Pires, F
    NINTH IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE 2000), VOLS 1 AND 2, 2000, : 302 - 307
  • [32] Automatic clustering by multi-objective genetic algorithm with numeric and categorical features
    Dutta, Dipankar
    Sil, Jaya
    Dutta, Paramartha
    EXPERT SYSTEMS WITH APPLICATIONS, 2019, 137 : 357 - 379
  • [33] A genetic hard c-means clustering algorithm
    Meng, L
    Wu, QH
    Yong, ZZ
    DYNAMICS OF CONTINUOUS DISCRETE AND IMPULSIVE SYSTEMS-SERIES B-APPLICATIONS & ALGORITHMS, 2002, 9 (03): : 421 - 438
  • [34] Fuzzy c-means clustering of incomplete data
    Hathaway, RJ
    Bezdek, JC
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2001, 31 (05): : 735 - 744
  • [35] Kernel-based fuzzy c-means clustering algorithm based on genetic algorithm
    Ding, Yi
    Fu, Xian
    NEUROCOMPUTING, 2016, 188 : 233 - 238
  • [36] Comparative study of a genetic fuzzy c-means algorithm and a validity guided fuzzy c-means algorithm for locating clusters in noisy data
    Egan, MA
    Krishnamoorthy, M
    Rajan, K
    1998 IEEE INTERNATIONAL CONFERENCE ON EVOLUTIONARY COMPUTATION - PROCEEDINGS, 1998, : 440 - 445
  • [37] An Efficient Genetic Algorithm with Fuzzy c-Means Clustering for Traveling Salesman Problem
    Yoon, Jong-Won
    Cho, Sung-Bae
    2011 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2011, : 1452 - 1456
  • [38] Weighted Fuzzy C-Means Clustering Based on Double Coding Genetic Algorithm
    Chen, Duo
    Cui, Du-Wu
    Wang, Chao-Xue
    INTELLIGENT COMPUTING, PART I: INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING, ICIC 2006, PART I, 2006, 4113 : 622 - 633
  • [40] Distributed C-Means Data Clustering Algorithm
    Oliva, Gabriele
    Setola, Roberto
    Hadjicostis, Christoforos N.
    2016 IEEE 55TH CONFERENCE ON DECISION AND CONTROL (CDC), 2016, : 4396 - 4401