Using the Negentropy Increment to Determine the Number of Clusters

Cited by: 0
Authors
Lago-Fernandez, Luis F. [1]
Corbacho, Fernando [2]
Affiliations
[1] Univ Autonoma Madrid, Escuela Politecn Super, Dept Ingn Informat, E-28049 Madrid, Spain
[2] Cognodata Consulting, E-28010 Madrid, Spain
Keywords
Crisp clustering; Cluster validation; Negentropy; MIXTURE MODEL; VALIDITY; NEC
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce a new validity index for crisp clustering that is based on the average normality of the clusters. A normal cluster is optimal in the sense of maximum uncertainty, or minimum structure, and so performing further partitions on it will not reveal additional substructures. To characterize the normality of a cluster we use the negentropy, a standard measure of distance to normality which evaluates the difference between the cluster's entropy and the entropy of a normal distribution with the same covariance matrix. Although the definition of the negentropy involves the differential entropy, we show that it is possible to avoid its explicit computation by considering only negentropy increments with respect to the initial data distribution. The resulting negentropy increment validity index only requires the computation of determinants of covariance matrices. We have applied the index to randomly generated problems, and show that it provides better results than other indices for the assessment of the number of clusters.
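As a rough illustration of why only covariance determinants are needed, the sketch below computes a negentropy increment for a crisp partition. It is not taken from the paper: the formula is reconstructed from the abstract under the assumption that the clusters occupy disjoint regions, so that the differential entropy of the whole data set cancels out, leaving ΔJ = ½ Σ_k p_k log|Σ_k| − ½ log|Σ_0| − Σ_k p_k log p_k, where p_k is the fraction of points in cluster k, Σ_k its covariance matrix, and Σ_0 the covariance of all the data. The function name `negentropy_increment` and the sign convention (lower means more normal clusters) are assumptions of this sketch.

```python
import numpy as np

def negentropy_increment(X, labels):
    """Negentropy increment of a crisp partition relative to the unpartitioned data.

    Assumed formula (reconstructed, not quoted from the paper):
        0.5 * sum_k p_k * log|Sigma_k| - 0.5 * log|Sigma_0| - sum_k p_k * log(p_k)
    Clusters with too few points have a singular covariance and yield -inf.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n = X.shape[0]

    # Log-determinant of the covariance of the full data set (Sigma_0).
    _, logdet_all = np.linalg.slogdet(np.atleast_2d(np.cov(X, rowvar=False)))
    delta = -0.5 * logdet_all

    for k in np.unique(labels):
        Xk = X[labels == k]
        p = Xk.shape[0] / n  # prior probability of cluster k
        _, logdet_k = np.linalg.slogdet(np.atleast_2d(np.cov(Xk, rowvar=False)))
        delta += 0.5 * p * logdet_k - p * np.log(p)

    return delta
```

Under these assumptions, one would run any crisp clustering algorithm for several candidate numbers of clusters, evaluate the function on each resulting partition, and prefer the partition with the lowest value. A partition with a single cluster always scores zero, which serves as the baseline.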
Pages: 448 / +
Page count: 3