Validity index for crisp and fuzzy clusters

Cited by: 519
Authors
Pakhira, MK [1]
Bandyopadhyay, S
Maulik, U
Affiliations
[1] Kalyani Govt Engn Coll, Dept Comp Sci & Technol, Kalyani 741235, W Bengal, India
[2] Indian Stat Inst, Machine Intelligence Unit, Kolkata 700108, W Bengal, India
[3] Kalyani Govt Engn Coll, Dept Comp Sci & Technol, Kalyani 741235, W Bengal, India
Keywords
clustering; expectation maximization algorithm; fuzzy c-means algorithm; k-means algorithm; unsupervised classification; validity index
DOI
10.1016/j.patcog.2003.06.005
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this article, a cluster validity index and its fuzzification are described, which can provide a measure of the goodness of clustering on different partitions of a data set. The maximum value of this index, called the PBM-index, across the hierarchy provides the best partitioning. The index is defined as a product of three factors, maximization of which ensures the formation of a small number of compact clusters with large separation between at least two clusters. We have used both the k-means and the expectation maximization algorithms as underlying crisp clustering techniques. For fuzzy clustering, we have utilized the well-known fuzzy c-means algorithm. Results demonstrating the superiority of the PBM-index in appropriately determining the number of clusters, as compared to three other well-known measures, the Davies-Bouldin index, Dunn's index and the Xie-Beni index, are provided for several artificial and real-life data sets. © 2003 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
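The abstract names the three factors only in words. As a rough illustration, the sketch below assumes the commonly cited crisp form of the index, PBM(K) = ((1/K) * (E_1/E_K) * D_K)^2, where E_K is the summed distance of points to their own cluster centers, E_1 is the same quantity for a single cluster, and D_K is the largest distance between two centers. The function name `pbm_index` and its arguments are hypothetical, and the formula is not quoted from this record.

```python
import numpy as np


def pbm_index(X, labels):
    """Crisp PBM-index sketch: ((1/K) * (E_1 / E_K) * D_K) ** 2.

    Assumes Euclidean distance and cluster centers taken as cluster means.
    If every point coincides with its center, E_K is 0 and the ratio is undefined.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    ids = np.unique(labels)
    K = len(ids)

    # Center of each cluster (mean of its members).
    centers = np.array([X[labels == k].mean(axis=0) for k in ids])

    # E_1: total distance of all points to the single global center.
    e1 = np.linalg.norm(X - X.mean(axis=0), axis=1).sum()

    # E_K: total distance of points to their own cluster center (compactness).
    ek = sum(
        np.linalg.norm(X[labels == k] - c, axis=1).sum()
        for k, c in zip(ids, centers)
    )

    # D_K: largest distance between any two cluster centers (separation).
    dk = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2).max()

    return ((1.0 / K) * (e1 / ek) * dk) ** 2


# Toy data: two tight, well-separated groups of two points each.
X = [[0, 0], [0, 2], [10, 0], [10, 2]]
good = pbm_index(X, [0, 0, 1, 1])  # groups split correctly, about 650
bad = pbm_index(X, [0, 1, 0, 1])   # groups mixed, about 1.04
```

Under this form, the partition with the larger value is preferred, which matches the abstract's statement that the maximum of the index indicates the best partitioning.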
Pages: 487-501
Page count: 15