Comparing the performance of biomedical clustering methods

Authors
Wiwie, Christian [1]
Baumbach, Jan [1, 2, 3]
Röttger, Richard [1]
Affiliations
[1] Univ Southern Denmark, Dept Math & Comp Sci, Odense, Denmark
[2] Max Planck Inst Informat, Computat Syst Biol, D-66123 Saarbrucken, Germany
[3] Univ Saarland, Cluster Excellence Multimodal Comp & Interact, D-66123 Saarbrucken, Germany
Keywords
PROTEIN-INTERACTION NETWORKS; GENE-EXPRESSION DATA; MICROARRAY DATA; AUTOMATED-METHOD; ALGORITHMS; COMPLEXES; DISCOVERY; DATABASE; MODEL;
DOI
10.1038/NMETH.3583
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Identifying groups of similar objects is a popular first step in biomedical data analysis, but it is error-prone and impossible to perform manually. Many computational methods have been developed to tackle this problem. Here we assessed 13 well-known methods using 24 data sets ranging from gene expression to protein domains. Performance was judged on the basis of 13 common cluster validity indices. We developed a clustering analysis platform, ClustEval (http://clusteval.mpi-inf.mpg.de), to promote streamlined evaluation, comparison and reproducibility of clustering results in the future. This allowed us to objectively evaluate the performance of all tools on all data sets with up to 1,000 different parameter sets each, resulting in a total of more than 4 million calculated cluster validity indices. We observed that there was no universal best performer, but on the basis of this wide-ranging comparison we were able to develop a short guideline for biomedical clustering tasks. ClustEval allows biomedical researchers to pick the appropriate tool for their data type and allows method developers to compare their tool to the state of the art.
Pages: 1033-1038
Page count: 6
Related papers (first 10 of 50 shown)
  • [1] Comparing the performance of biomedical clustering methods
    Christian Wiwie
    Jan Baumbach
    Richard Röttger
    Nature Methods, 2015, 12 : 1033 - 1038
  • [2] The Similarity Plot for Comparing Clustering Methods
    Jang, Dae-Heung
    KOREAN JOURNAL OF APPLIED STATISTICS, 2013, 26 (02) : 361 - 373
  • [3] Comparing Methods for Analysis of Biomedical Hyperspectral Image Data
    Leavesley, Silas J.
    Sweat, Brenner
    Abbott, Caitlyn
    Favreau, Peter F.
    Annamdevula, Naga S.
    Rich, Thomas C.
    IMAGING, MANIPULATION, AND ANALYSIS OF BIOMOLECULES, CELLS, AND TISSUES XV, 2017, 10068
  • [4] Comparing methods for drug–gene interaction prediction on the biomedical literature knowledge graph: performance versus explainability
    Fotis Aisopos
    Georgios Paliouras
    BMC Bioinformatics, 24
  • [5] Comparing clustering methods for database categorization in image retrieval
    Käster, T
    Wendt, V
    Sagerer, G
    PATTERN RECOGNITION, PROCEEDINGS, 2003, 2781 : 228 - 235
  • [6] COMPARING INITIALISATION METHODS FOR THE HEURISTIC MEMETIC CLUSTERING ALGORITHM
    Craenen, B. G. W.
    Ristaniemi, T.
    Nandi, A. K.
    2014 PROCEEDINGS OF THE 22ND EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2014, : 1158 - 1162
  • [7] Comparing methods for drug-gene interaction prediction on the biomedical literature knowledge graph: performance versus explainability
    Aisopos, Fotis
    Paliouras, Georgios
    BMC BIOINFORMATICS, 2023, 24 (01)
  • [8] Comparing Different Methods of Agglomerative Hierarchical Clustering with Pairwise Constraints
    Takumi, Satoshi
    Miyamoto, Sadaaki
    6TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS, AND THE 13TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS, 2012, : 1545 - 1550
  • [9] Comparing Semi-Automated Clustering Methods for Persona Development
    Brickey, Jonalan
    Walczak, Steven
    Burgess, Tony
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2012, 38 (03) : 537 - 546
  • [10] Comparing non-parametric ensemble methods for document clustering
    Gonzalez, Edgar
    Turmo, Jordi
    NATURAL LANGUAGE AND INFORMATION SYSTEMS, PROCEEDINGS, 2008, 5039 : 245 - 256