Using the Negentropy Increment to Determine the Number of Clusters

Cited by: 0
Authors
Lago-Fernandez, Luis F. [1]
Corbacho, Fernando [2]
Affiliations
[1] Univ Autonoma Madrid, Escuela Politecn Super, Dept Ingn Informat, E-28049 Madrid, Spain
[2] Cognodata Consulting, E-28010 Madrid, Spain
Keywords
Crisp clustering; Cluster validation; Negentropy; MIXTURE MODEL; VALIDITY; NEC;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We introduce a new validity index for crisp clustering that is based on the average normality of the clusters. A normal cluster is optimal in the sense of maximum uncertainty, or minimum structure, and so performing further partitions on it will not reveal additional substructures. To characterize the normality of a cluster we use the negentropy, a standard measure of distance to normality which evaluates the difference between the cluster's entropy and the entropy of a normal distribution with the same covariance matrix. Although the definition of the negentropy involves the differential entropy, we show that it is possible to avoid its explicit computation by considering only negentropy increments with respect to the initial data distribution. The resulting negentropy increment validity index only requires the computation of determinants of covariance matrices. We have applied the index to randomly generated problems, and show that it provides better results than other indices for the assessment of the number of clusters.
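The abstract does not state the formula, so the following is only a minimal sketch of how such an index could be computed from covariance determinants. It assumes the form ΔJ = (1/2) Σ_i p_i log|Σ_i| − (1/2) log|Σ_0| − Σ_i p_i log p_i, where Σ_0 is the covariance of the whole data set, Σ_i the covariance of cluster i, and p_i the fraction of points in cluster i. This expression follows from the negentropy definition when the clusters occupy disjoint regions, because the differential entropy terms cancel. The function name `negentropy_increment` and its arguments are illustrative and are not taken from the paper.

```python
import numpy as np


def negentropy_increment(X, labels):
    """Assumed negentropy increment of a crisp partition (lower is better).

    X      : (n, d) array of data points.
    labels : length-n array of cluster labels.
    Each cluster needs more than d points, otherwise its covariance
    matrix is singular and the log-determinant is -inf.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n, d = X.shape

    # Log-determinant of the covariance of the full data set (Sigma_0).
    cov0 = np.cov(X, rowvar=False).reshape(d, d)
    _, logdet0 = np.linalg.slogdet(cov0)

    total = -0.5 * logdet0
    for k in np.unique(labels):
        Xk = X[labels == k]
        p = len(Xk) / n  # prior probability of cluster k
        cov_k = np.cov(Xk, rowvar=False).reshape(d, d)
        _, logdet_k = np.linalg.slogdet(cov_k)
        # Covariance term plus the partition-entropy term -p log p.
        total += 0.5 * p * logdet_k - p * np.log(p)
    return total
```

Under this assumed form, a partition with a single cluster scores zero, and candidate partitions with different numbers of clusters can be compared by choosing the one with the lowest value.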
Pages: 448 / +
Page count: 3