A comparative analysis of biclustering algorithms for gene expression data

Cited by: 161
Authors
Eren, Kemal [1]
Deveci, Mehmet [1]
Kucuktunc, Onur [1]
Catalyurek, Umit V. [2, 3]
Affiliations
[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
[2] Ohio State Univ, Dept Biomed Informat, Columbus, OH 43210 USA
[3] Ohio State Univ, Dept Elect & Comp Engn, Columbus, OH 43210 USA
Funding
National Institutes of Health (USA); National Science Foundation (USA)
Keywords
biclustering; microarray; gene expression; clustering; MICROARRAY DATA; BIOCONDUCTOR; PATTERNS;
DOI
10.1093/bib/bbs032
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
The need to analyze high-dimension biological data is driving the development of new data mining methods. Biclustering algorithms have been successfully applied to gene expression data to discover local patterns, in which a subset of genes exhibit similar expression levels over a subset of conditions. However, it is not clear which algorithms are best suited for this task. Many algorithms have been published in the past decade, most of which have been compared only to a small number of algorithms. Surveys and comparisons exist in the literature, but because of the large number and variety of biclustering algorithms, they are quickly outdated. In this article we partially address this problem of evaluating the strengths and weaknesses of existing biclustering methods. We used the BiBench package to compare 12 algorithms, many of which were recently published or have not been extensively studied. The algorithms were tested on a suite of synthetic data sets to measure their performance on data with varying conditions, such as different bicluster models, varying noise, varying numbers of biclusters and overlapping biclusters. The algorithms were also tested on eight large gene expression data sets obtained from the Gene Expression Omnibus. Gene Ontology enrichment analysis was performed on the resulting biclusters, and the best enrichment terms are reported. Our analyses show that the biclustering method and its parameters should be selected based on the desired model, whether that model allows overlapping biclusters, and its robustness to noise. In addition, we observe that the biclustering algorithms capable of finding more than one model are more successful at capturing biologically relevant clusters.
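The synthetic-data evaluation described in the abstract can be illustrated with a minimal sketch: plant one constant bicluster in a noisy matrix, recover it, and score the result. Everything here is an assumption made for illustration. The function names are hypothetical, the mean-threshold recovery step is a naive stand-in rather than one of the 12 compared algorithms, and the Jaccard index over matrix cells is one plausible agreement score, not necessarily the one used in the article.

```python
import random


def make_matrix(n_rows, n_cols, bic_rows, bic_cols, level=5.0, noise=0.5, seed=0):
    """Gaussian background matrix with one constant bicluster added on top."""
    rng = random.Random(seed)
    data = [[rng.gauss(0.0, noise) for _ in range(n_cols)] for _ in range(n_rows)]
    for i in bic_rows:
        for j in bic_cols:
            data[i][j] += level  # shift the planted cells upward
    return data


def threshold_bicluster(data, cutoff):
    """Naive stand-in: keep rows and columns whose mean exceeds the cutoff."""
    n_rows, n_cols = len(data), len(data[0])
    rows = [i for i in range(n_rows) if sum(data[i]) / n_cols > cutoff]
    cols = [j for j in range(n_cols)
            if sum(data[i][j] for i in range(n_rows)) / n_rows > cutoff]
    return rows, cols


def jaccard(found, planted):
    """Jaccard index of two biclusters, each viewed as a set of (row, col) cells."""
    cells_a = {(i, j) for i in found[0] for j in found[1]}
    cells_b = {(i, j) for i in planted[0] for j in planted[1]}
    union = cells_a | cells_b
    return len(cells_a & cells_b) / len(union) if union else 1.0


planted = (list(range(5)), list(range(5)))        # 5 genes x 5 conditions
data = make_matrix(20, 20, *planted, noise=0.05)  # low-noise setting
found = threshold_bicluster(data, cutoff=0.6)
print(jaccard(found, planted))
```

Raising `noise` or lowering `level` makes the planted rows and columns harder to separate from the background, which is the kind of varying condition the abstract refers to.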
Pages: 279-292
Number of pages: 14