Weighted rank aggregation of cluster validation measures: a Monte Carlo cross-entropy approach

Cited by: 201
Authors
Pihur, Vasyl [1]
Datta, Susmita [1]
Datta, Somnath [1]
Affiliations
[1] Univ Louisville, Dept Bioinformat & Biostat, Louisville, KY 40202 USA
DOI
10.1093/bioinformatics/btm158
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Biologists often employ clustering techniques in the exploratory phase of microarray data analysis to discover relevant biological groupings. Given the many clustering algorithms available in the machine-learning literature, a user might want to select the one that performs best for their data set or application. Various validation measures have been proposed over the years to judge the quality of the clusters produced by a given clustering algorithm, including their biological relevance. Unfortunately, a given clustering algorithm can perform poorly under one validation measure while outperforming many other algorithms under another. A manual synthesis of results from multiple validation measures is nearly impossible in practice, especially when a large number of clustering algorithms are to be compared using several measures. An automated and objective way of reconciling the rankings is needed.
Results: Using a Monte Carlo cross-entropy algorithm, we successfully combine the ranks of a set of clustering algorithms under consideration via a weighted aggregation that optimizes a distance criterion. The proposed weighted rank aggregation allows for a far more objective and automated assessment of clustering results than a simple visual inspection. We illustrate our procedure using one simulated and three real gene expression data sets from various platforms, where we rank a total of 11 clustering algorithms using a combined examination of 10 different validation measures. The aggregate rankings were found for a given number of clusters k and also for an entire range of k.
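The idea in the Results paragraph can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made for illustration: the distance criterion is taken to be the Spearman footrule, the weights are simple per-list weights, and the names `footrule_distance`, `objective`, `sample_permutation`, and `cross_entropy_aggregate`, along with all default parameter values, are hypothetical. The cross-entropy step keeps a matrix of item-by-position probabilities, samples candidate orderings from it, and re-estimates the matrix from the lowest-distance samples.

```python
import numpy as np

def footrule_distance(candidate, ranked_list):
    """Spearman footrule: sum over items of |position in candidate - position in list|."""
    pos_c = {item: i for i, item in enumerate(candidate)}
    pos_l = {item: i for i, item in enumerate(ranked_list)}
    return sum(abs(pos_c[x] - pos_l[x]) for x in pos_c)

def objective(candidate, ranked_lists, weights):
    """Weighted sum of distances from the candidate ordering to every input list.
    Per-list weights are an assumption of this sketch."""
    return sum(w * footrule_distance(candidate, l)
               for w, l in zip(weights, ranked_lists))

def sample_permutation(prob, rng):
    """Draw an ordering position by position from an (items x positions) matrix."""
    remaining = list(range(prob.shape[0]))
    order = []
    for pos in range(prob.shape[0]):
        p = prob[remaining, pos]
        pick = int(rng.choice(len(remaining), p=p / p.sum()))
        order.append(remaining.pop(pick))
    return order

def cross_entropy_aggregate(ranked_lists, weights, n_samples=200,
                            elite_frac=0.1, alpha=0.7, n_iter=30, seed=0):
    """Search for the ordering that minimizes the weighted distance criterion."""
    rng = np.random.default_rng(seed)
    items = sorted(ranked_lists[0])
    idx = {item: i for i, item in enumerate(items)}
    lists_idx = [[idx[x] for x in l] for l in ranked_lists]
    n = len(items)
    prob = np.full((n, n), 1.0 / n)          # start from uniform probabilities
    n_elite = max(1, int(elite_frac * n_samples))
    best_order, best_score = None, float("inf")
    for _ in range(n_iter):
        samples = [sample_permutation(prob, rng) for _ in range(n_samples)]
        scores = np.array([objective(s, lists_idx, weights) for s in samples])
        elite = np.argsort(scores)[:n_elite]  # lowest-distance samples
        if scores[elite[0]] < best_score:
            best_score = float(scores[elite[0]])
            best_order = samples[elite[0]]
        # Re-estimate item-by-position frequencies from the elite samples.
        freq = np.zeros((n, n))
        for e in elite:
            for pos, item in enumerate(samples[e]):
                freq[item, pos] += 1
        # Smoothed update keeps every probability strictly positive.
        prob = alpha * freq / n_elite + (1 - alpha) * prob
    return [items[i] for i in best_order], best_score
```

A call such as `cross_entropy_aggregate([list("ABCD"), list("ABCD"), list("BADC")], [1.0, 1.0, 1.0])` returns the best ordering found together with its objective value. In practice each input list would be the ranking of the clustering algorithms under one validation measure.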
Pages: 1607-1615
Page count: 9