Judging the quality of gene expression-based clustering methods using gene annotation

Cited by: 220
Authors
Gibbons, FD [1]
Roth, FP [1]
Affiliations
[1] Harvard Univ, Sch Med, Dept Biol Chem & Mol Pharmacol, Boston, MA 02115 USA
DOI
10.1101/gr.397002
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
We compare several commonly used expression-based gene clustering algorithms using a figure of merit based on the mutual information between cluster membership and known gene attributes. By studying various publicly available expression data sets we conclude that enrichment of clusters for biological function is, in general, highest at rather low cluster numbers. As a measure of dissimilarity between the expression patterns of two genes, no method outperforms Euclidean distance for ratio-based measurements, or Pearson distance for non-ratio-based measurements at the optimal choice of cluster number. We show the self-organized-map approach to be best for both measurement types at higher numbers of clusters. Clusters of genes derived from single- and average-linkage hierarchical clustering tend to produce worse-than-random results.
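As a rough illustration of the quantities named in the abstract, the following Python sketch computes the mutual information between cluster membership and a single known gene attribute, along with the two dissimilarity measures mentioned. It is not the authors' implementation: the function names are hypothetical, the score covers one attribute only, and defining Pearson distance as 1 minus the Pearson correlation is an assumption. The abstract does not say how the published figure of merit combines attributes or normalizes the result.

```python
import math
from collections import Counter


def mutual_information(clusters, attribute):
    """Mutual information (in bits) between two equal-length label sequences.

    clusters:  cluster label for each gene
    attribute: known annotation for each gene (e.g. True/False for one function)
    """
    n = len(clusters)
    if n == 0 or n != len(attribute):
        raise ValueError("inputs must be non-empty and of equal length")
    joint = Counter(zip(clusters, attribute))  # counts of (cluster, attribute) pairs
    n_c = Counter(clusters)                    # marginal counts per cluster
    n_a = Counter(attribute)                   # marginal counts per attribute value
    mi = 0.0
    for (c, a), n_ca in joint.items():
        # p(c,a) * log2( p(c,a) / (p(c) * p(a)) ), written with raw counts
        mi += (n_ca / n) * math.log2(n_ca * n / (n_c[c] * n_a[a]))
    return mi


def euclidean_distance(x, y):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson_distance(x, y):
    """Assumed definition: 1 - Pearson correlation of the two profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1.0 - cov / (sx * sy)


if __name__ == "__main__":
    # Toy data: four genes, two clusters, one binary annotation.
    clusters = [0, 0, 1, 1]
    annotated = [True, True, False, False]
    print(mutual_information(clusters, annotated))
    print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))
    print(pearson_distance([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

In the toy data the annotation is fully determined by the cluster label, so the score is at its maximum for a balanced binary attribute (1 bit). A clustering unrelated to the annotation would score 0.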
Pages: 1574-1581
Page count: 8