Optimized leaf ordering with class labels for hierarchical clustering

Cited by: 3
Authors
Novoselova, Natalia [1]
Wang, Junxi [2]
Klawonn, Frank [2, 3]
Affiliations
[1] United Inst Informat Problems, Dept Bioinformat, Surganova Str 6, Minsk 220012, BELARUS
[2] Helmholtz Ctr Infect Res, Biostat, D-38124 Braunschweig, Germany
[3] Ostfalia Univ Appl Sci, Dept Comp Sci, D-38302 Wolfenbuttel, Germany
Keywords
Hierarchical clustering; dendrogram; leaf ordering; dynamic programming; biomedical data
DOI
10.1142/S0219720015500122
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Hierarchical clustering is extensively used in the bioinformatics community to analyze biomedical data. These data are often tagged with class labels, such as disease subtypes or gene ontology (GO) terms. Heatmaps combined with dendrograms are the common standard for visualizing the results of hierarchical clustering. The heatmap can be enriched by an additional color bar at the side, indicating for each instance in the data set the class to which it belongs. In the ideal case, when the clustering matches the classes perfectly, one would expect instances from the same class to cluster together and the color bar to consist of well-separated color blocks without frequent alternation of colors (classes). But even when instances from the same class cluster perfectly together, the dendrogram might not reflect this, because its representation is not unique: the two subtrees at any internal node can be swapped without changing the clustering. In this paper, we propose a leaf ordering algorithm for the dendrogram that, while preserving the hierarchical clustering result, tries to group instances from the same class together. It is based on dynamic programming, which can efficiently compute an optimal or nearly optimal order consistent with the structure of the tree.
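The idea of reordering leaves without changing the tree can be illustrated with a minimal sketch. This is not the authors' algorithm; it assumes a binary dendrogram given as nested 2-tuples, a hypothetical objective (the number of class changes between neighboring leaves), and hypothetical names `best_orders` and `order_leaves`. At each internal node the two subtrees may be swapped, and the dynamic program keeps, for every pair of end labels, the cheapest ordering of the subtree.

```python
from itertools import product

def best_orders(tree, label):
    """Map (first_label, last_label) -> (class_changes, leaf_order) for a subtree."""
    if not isinstance(tree, tuple):            # a leaf: one instance, no changes
        c = label[tree]
        return {(c, c): (0, [tree])}
    left, right = (best_orders(child, label) for child in tree)
    table = {}
    # Try both orientations of the node: left|right and right|left.
    for first, second in ((left, right), (right, left)):
        for ((a, b), (ca, oa)), ((c, d), (cb, ob)) in product(first.items(), second.items()):
            cost = ca + cb + (b != c)          # one extra change if labels differ at the seam
            if (a, d) not in table or cost < table[(a, d)][0]:
                table[(a, d)] = (cost, oa + ob)
    return table

def order_leaves(tree, label):
    """Return (class_changes, leaf_order) with the fewest class changes."""
    return min(best_orders(tree, label).values(), key=lambda entry: entry[0])

# Toy dendrogram: s1 and s2 merge first, then s3 joins them.
tree = (("s1", "s2"), "s3")
labels = {"s1": "A", "s2": "B", "s3": "A"}
changes, order = order_leaves(tree, labels)
print(changes, order)   # the input order A, B, A has 2 class changes; the best has 1
```

Because the cost of joining two subtrees depends only on the labels at the seam, the table for each node needs at most one entry per pair of class labels, which keeps the sketch cheap when the number of classes is small.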
Pages: 19