A comparison of four clustering methods for brain expression microarray data

Cited by: 21
Authors
Richards, Alexander L. [1]
Holmans, Peter [1]
O'Donovan, Michael C. [1]
Owen, Michael J. [1]
Jones, Lesley [1]
Affiliations
[1] Univ Wales Hosp, Dept Psychol Med, Sch Med, Cardiff CF14 4XN, S Glam, Wales
DOI
10.1186/1471-2105-9-490
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Background: DNA microarrays, which measure the expression levels of tens of thousands of genes in a sample, are an important research tool. However, the volume of data they produce can be an obstacle to interpreting the results. Clustering genes by the similarity of their expression profiles can simplify the data and potentially provides an important source of biological inference, but these methods have not been tested systematically on datasets from complex human tissues. In this paper, four clustering methods (CRC, k-means, ISA and memISA) are applied to three brain expression datasets. The results are compared on speed, gene coverage and GO enrichment. The effects of combining the clusters produced by each method are also assessed.
Results: k-means outperforms the other methods, with 100% gene coverage and GO enrichments only slightly exceeded by those of memISA and ISA. Those two methods produce greater GO enrichments on the datasets used, but at the cost of much lower gene coverage, fewer clusters and reduced speed. The clusters they find are largely different to those produced by k-means. Combining clusters produced by k-means with those from memISA or ISA increases GO enrichment and the number of clusters produced (compared to k-means alone), without negatively impacting gene coverage. memISA can also find potentially disease-related clusters. In two independent dorsolateral prefrontal cortex datasets, it finds three overlapping clusters that are enriched for genes associated with schizophrenia, genes differentially expressed in schizophrenia, or both. Two of these clusters are enriched for genes of the MAP kinase pathway, suggesting a possible role for this pathway in the aetiology of schizophrenia.
Conclusion: Considered alone, k-means clustering is the most effective of the four methods on typical microarray brain expression datasets. However, memISA and ISA can add extra high-quality clusters to the set produced by k-means, so combining these three methods is the method of choice.
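As a rough illustration of two of the comparison criteria named in the abstract, the sketch below computes gene coverage for a set of clusters and a GO enrichment p-value for a single cluster. It is not the authors' pipeline: the function names, the toy gene identifiers and the use of a one-sided hypergeometric test are all assumptions made for this example, and the abstract does not state which enrichment statistic was used.

```python
from math import comb


def gene_coverage(clusters, all_genes):
    """Fraction of genes in all_genes that fall in at least one cluster."""
    covered = set().union(*clusters)
    return len(covered & all_genes) / len(all_genes)


def enrichment_p_value(cluster, annotated, all_genes):
    """One-sided hypergeometric p-value P(X >= k) for a GO term.

    Assumed test (not taken from the source): k is the number of cluster
    genes carrying the annotation, drawn without replacement from all_genes.
    """
    N = len(all_genes)
    K = len(annotated & all_genes)
    n = len(cluster & all_genes)
    k = len(cluster & annotated & all_genes)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


# Toy data, invented for illustration only.
all_genes = {f"g{i}" for i in range(1, 11)}
go_term_genes = {"g1", "g2", "g3"}

# A partition of every gene (k-means style) and one small bicluster (ISA style).
kmeans_clusters = [{"g1", "g2", "g3", "g4", "g5"}, {"g6", "g7", "g8", "g9", "g10"}]
isa_like_clusters = [{"g1", "g2", "g3"}]
combined_clusters = kmeans_clusters + isa_like_clusters

print(gene_coverage(kmeans_clusters, all_genes))
print(gene_coverage(isa_like_clusters, all_genes))
print(gene_coverage(combined_clusters, all_genes))
print(enrichment_p_value(kmeans_clusters[0], go_term_genes, all_genes))
print(enrichment_p_value(isa_like_clusters[0], go_term_genes, all_genes))
```

In this toy case the small bicluster is more strongly enriched than the larger partition cluster but covers far fewer genes, and pooling the two sets keeps full coverage while adding the enriched cluster. That mirrors, in miniature, the trade-off the abstract reports.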
Pages: 17