A comparison of four clustering methods for brain expression microarray data

Cited by: 21
Authors
Richards, Alexander L. [1 ]
Holmans, Peter [1 ]
O'Donovan, Michael C. [1 ]
Owen, Michael J. [1 ]
Jones, Lesley [1 ]
Affiliation
[1] Univ Wales Hosp, Dept Psychol Med, Sch Med, Cardiff CF14 4XN, S Glam, Wales
DOI
10.1186/1471-2105-9-490
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: DNA microarrays, which determine the expression levels of tens of thousands of genes from a sample, are an important research tool. However, the volume of data they produce can be an obstacle to interpretation of the results. Clustering the genes on the basis of similarity of their expression profiles can simplify the data, and potentially provides an important source of biological inference, but these methods have not been tested systematically on datasets from complex human tissues. In this paper, four clustering methods, CRC, k-means, ISA and memISA, are used upon three brain expression datasets. The results are compared on speed, gene coverage and GO enrichment. The effects of combining the clusters produced by each method are also assessed.
Results: k-means outperforms the other methods, with 100% gene coverage and GO enrichments only slightly exceeded by memISA and ISA. Those two methods produce greater GO enrichments on the datasets used, but at the cost of much lower gene coverage, fewer clusters produced, and speed. The clusters they find are largely different to those produced by k-means. Combining clusters produced by k-means and memISA or ISA leads to increased GO enrichment and number of clusters produced (compared to k-means alone), without negatively impacting gene coverage. memISA can also find potentially disease-related clusters. In two independent dorsolateral prefrontal cortex datasets, it finds three overlapping clusters that are either enriched for genes associated with schizophrenia, genes differentially expressed in schizophrenia, or both. Two of these clusters are enriched for genes of the MAP kinase pathway, suggesting a possible role for this pathway in the aetiology of schizophrenia.
Conclusion: Considered alone, k-means clustering is the most effective of the four methods on typical microarray brain expression datasets. However, memISA and ISA can add extra high-quality clusters to the set produced by k-means, so combining these three methods is the method of choice.
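Two of the comparison criteria named in the abstract, gene coverage and GO enrichment, can be illustrated with a small Python sketch. The abstract does not say how the paper computed either quantity, so this is only one common formulation: coverage is taken as the fraction of genes assigned to at least one cluster, and enrichment as a one-sided hypergeometric test. The function names, gene identifiers and the annotated set are all hypothetical.

```python
from math import comb  # requires Python 3.8+


def gene_coverage(clusters, all_genes):
    """Fraction of genes in all_genes that appear in at least one cluster."""
    universe = set(all_genes)
    covered = set().union(*clusters) & universe
    return len(covered) / len(universe)


def enrichment_p_value(cluster, annotated, all_genes):
    """One-sided hypergeometric P(X >= k) for a single annotation term.

    Assumption: this is a generic over-representation test, not
    necessarily the procedure used in the paper.
    """
    universe = set(all_genes)
    cluster = set(cluster) & universe
    annotated = set(annotated) & universe
    N, K, n = len(universe), len(annotated), len(cluster)
    k = len(cluster & annotated)
    # Sum the probabilities of seeing k or more annotated genes in the cluster.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


# Toy data: ten hypothetical genes and two overlapping clusters.
all_genes = [f"g{i}" for i in range(1, 11)]
clusters = [{"g1", "g2", "g3"}, {"g3", "g4"}]
annotated = {"g1", "g2", "g3", "g9"}  # genes carrying a hypothetical GO term

coverage = gene_coverage(clusters, all_genes)
p_value = enrichment_p_value(clusters[0], annotated, all_genes)
print(coverage, p_value)
```

Overlapping clusters, such as those the abstract reports for memISA, are counted once in the coverage figure because the clusters are merged with a set union.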
Pages: 17