Judging the quality of gene expression-based clustering methods using gene annotation

Cited by: 220
Authors
Gibbons, FD [1]
Roth, FP [1]
Affiliations
[1] Harvard Univ, Sch Med, Dept Biol Chem & Mol Pharmacol, Boston, MA 02115 USA
DOI
10.1101/gr.397002
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
We compare several commonly used expression-based gene clustering algorithms using a figure of merit based on the mutual information between cluster membership and known gene attributes. By studying various publicly available expression data sets we conclude that enrichment of clusters for biological function is, in general, highest at rather low cluster numbers. As a measure of dissimilarity between the expression patterns of two genes, no method outperforms Euclidean distance for ratio-based measurements, or Pearson distance for non-ratio-based measurements at the optimal choice of cluster number. We show the self-organized-map approach to be best for both measurement types at higher numbers of clusters. Clusters of genes derived from single- and average-linkage hierarchical clustering tend to produce worse-than-random results.
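The abstract does not spell out the figure of merit beyond saying it is based on the mutual information between cluster membership and known gene attributes. As a rough illustration only, the sketch below computes the mutual information (in bits) between a cluster assignment and one gene attribute, then sums it over several attributes. The function names, the 0/1 encoding of attributes, and the summation over attributes are assumptions made for this example, not the authors' published procedure.

```python
import math
from collections import Counter


def mutual_information(clusters, attribute):
    """Mutual information (bits) between two equal-length label sequences.

    clusters:  cluster label of each gene
    attribute: annotation label of each gene (e.g. 1 = has the attribute)
    """
    if len(clusters) != len(attribute):
        raise ValueError("clusters and attribute must have the same length")
    n = len(clusters)
    if n == 0:
        raise ValueError("at least one gene is required")

    joint = Counter(zip(clusters, attribute))   # counts of (cluster, attribute) pairs
    cluster_counts = Counter(clusters)          # marginal counts per cluster
    attribute_counts = Counter(attribute)       # marginal counts per attribute value

    mi = 0.0
    for (c, a), n_ca in joint.items():
        # p(c,a) * log2( p(c,a) / (p(c) * p(a)) ), written with raw counts
        mi += (n_ca / n) * math.log2(
            n_ca * n / (cluster_counts[c] * attribute_counts[a])
        )
    return mi


def total_mutual_information(clusters, annotations):
    """Sum of per-attribute mutual information (an assumed way to aggregate).

    annotations: dict mapping attribute name -> per-gene list of 0/1 flags
    """
    return sum(mutual_information(clusters, flags) for flags in annotations.values())


if __name__ == "__main__":
    # Hypothetical toy data: four genes in two clusters.
    clusters = [0, 0, 1, 1]
    annotations = {
        "attribute_A": [1, 1, 0, 0],  # perfectly aligned with the clusters
        "attribute_B": [1, 0, 1, 0],  # independent of the clusters
    }
    for name, flags in annotations.items():
        print(name, mutual_information(clusters, flags))
    print("total", total_mutual_information(clusters, annotations))
```

In this toy case the attribute that coincides with the clusters contributes 1 bit and the independent one contributes 0 bits. A score like this would normally be compared against the same quantity computed for randomly assigned clusters before drawing conclusions such as "worse than random".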
Pages: 1574-1581
Number of pages: 8