Identification of Regulatory Modules in Time Series Gene Expression Data Using a Linear Time Biclustering Algorithm

Cited by: 69
Authors
Madeira, Sara C. [1, 2]
Teixeira, Miguel C. [3]
Sa-Correia, Isabel [3]
Oliveira, Arlindo L. [2]
Affiliations
[1] Univ Beira Interior, Dept Informat, P-6201001 Covilha, Portugal
[2] INESC, Knowledge Discovery & Bioinformat KDBIO Grp, P-1000029 Lisbon, Portugal
[3] Univ Tecn Lisboa, Inst Super Tecn, Biol Sci Res Grp, Dept Engn Quim & Biol, P-1049001 Lisbon, Portugal
Keywords
Biclustering; time series gene expression data; expression patterns; regulatory modules; CONSTRUCTION
DOI
10.1109/TCBB.2008.34
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Although most biclustering formulations are NP-hard, in time series expression data analysis, it is reasonable to restrict the problem to the identification of maximal biclusters with contiguous columns, which correspond to coherent expression patterns shared by a group of genes in consecutive time points. This restriction leads to a tractable problem. We propose an algorithm that finds and reports all maximal contiguous column coherent biclusters in time linear in the size of the expression matrix. The linear time complexity of CCC-Biclustering relies on the use of a discretized matrix and efficient string processing techniques based on suffix trees. We also propose a method for ranking biclusters based on their statistical significance and a methodology for filtering highly overlapping and, therefore, redundant biclusters. We report results in synthetic and real data showing the effectiveness of the approach and its relevance in the discovery of regulatory modules. Results obtained using the transcriptomic expression patterns occurring in Saccharomyces cerevisiae in response to heat stress show not only the ability of the proposed methodology to extract relevant information compatible with documented biological knowledge but also the utility of using this algorithm in the study of other environmental stresses and of regulatory modules in general.
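The idea of a maximal contiguous column coherent (CCC) bicluster on a discretized matrix can be illustrated with a small sketch. This is not the linear-time, suffix-tree-based CCC-Biclustering algorithm described in the abstract. It is a naive enumeration over all contiguous column ranges, written only to show what the algorithm reports. The three-symbol alphabet (U = up, D = down, N = no change), the `threshold` parameter, the function names, and the toy matrix are all assumptions made for illustration.

```python
from collections import defaultdict


def discretize(matrix, threshold=0.5):
    """Turn each expression row into a string over {U, D, N}.

    Assumption: symbols encode the change between consecutive time
    points, so a row with t time points yields t - 1 symbols.
    """
    rows = []
    for values in matrix:
        symbols = []
        for prev, curr in zip(values, values[1:]):
            delta = curr - prev
            if delta > threshold:
                symbols.append("U")
            elif delta < -threshold:
                symbols.append("D")
            else:
                symbols.append("N")
        rows.append("".join(symbols))
    return rows


def maximal_ccc_biclusters(rows, min_rows=2):
    """Naively list maximal CCC biclusters as (row_ids, start, end, pattern).

    Columns are the half-open range [start, end) of the discretized rows.
    A group is kept only if it cannot be extended one column to the left
    or right while keeping the same set of rows.
    """
    n_cols = len(rows[0]) if rows else 0
    found = []
    for start in range(n_cols):
        for end in range(start + 1, n_cols + 1):
            # Group all rows that share the same pattern in [start, end).
            groups = defaultdict(list)
            for idx, row in enumerate(rows):
                groups[row[start:end]].append(idx)
            for pattern, members in groups.items():
                if len(members) < min_rows:
                    continue
                extend_left = start > 0 and len(
                    {rows[i][start - 1] for i in members}) == 1
                extend_right = end < n_cols and len(
                    {rows[i][end] for i in members}) == 1
                if not (extend_left or extend_right):
                    found.append((members, start, end, pattern))
    return found


# Toy data: 4 hypothetical genes measured at 4 time points.
expression = [
    [1, 2, 3, 2],
    [0, 1, 2, 3],
    [5, 6, 7, 6],
    [3, 2, 1, 2],
]
patterns = discretize(expression)
biclusters = maximal_ccc_biclusters(patterns)
```

On the toy matrix, `patterns` is `["UUD", "UUU", "UUD", "DDU"]`, and the sketch reports three maximal biclusters: genes 0, 1, 2 sharing `UU` over the first two transitions, genes 0 and 2 sharing `UUD` over all three, and genes 1 and 3 sharing `U` on the last transition. The enumeration costs on the order of columns squared times rows, which is the cost the suffix-tree approach in the paper is designed to avoid.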
Pages: 153-165
Page count: 13
Related papers
50 results in total
  • [1] Identification of K-Tolerance Regulatory Modules in Time Series Gene Expression Data Using a Biclustering Algorithm
    Phukhachee, Tustanah
    Maneewongvatana, Songrit
    [J]. ACTIVE MEDIA TECHNOLOGY, AMT 2013, 2013, 8210 : 146 - 155
  • [2] A linear time biclustering algorithm for time series gene expression data
    Madeira, SC
    Oliveira, AL
    [J]. ALGORITHMS IN BIOINFORMATICS, PROCEEDINGS, 2005, 3692 : 39 - 52
  • [3] A New Biclustering Algorithm for Time-Series Gene Expression Data Analysis
    Xue, Yun
    Liao, Zhengling
    Li, Meihang
    Luo, Jie
    Hu, Xiaohui
    Luo, Guiyin
    Chen, Wen-Sheng
    [J]. 2014 TENTH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND SECURITY (CIS), 2014, : 268 - 272
  • [4] A polynomial time biclustering algorithm for finding approximate expression patterns in gene expression time series
    Madeira, Sara C.
    Oliveira, Arlindo L.
    [J]. ALGORITHMS FOR MOLECULAR BIOLOGY, 2009, 4
  • [5] A polynomial time biclustering algorithm for finding approximate expression patterns in gene expression time series
    Sara C Madeira
    Arlindo L Oliveira
    [J]. Algorithms for Molecular Biology, 4
  • [6] A contiguous column coherent evolution biclustering algorithm for time-series gene expression data
    Yun Xue
    Meizhen Zhang
    Zhengling Liao
    Meihang Li
    Jie Luo
    Xiaohui Hu
    [J]. International Journal of Machine Learning and Cybernetics, 2018, 9 : 441 - 453
  • [7] A contiguous column coherent evolution biclustering algorithm for time-series gene expression data
    Xue, Yun
    Zhang, Meizhen
    Liao, Zhengling
    Li, Meihang
    Luo, Jie
    Hu, Xiaohui
    [J]. INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2018, 9 (03) : 441 - 453
  • [8] Efficient Biclustering Algorithms for Time Series Gene Expression Data Analysis
    Madeira, Sara C.
    Oliveira, Arlindo L.
    [J]. DISTRIBUTED COMPUTING, ARTIFICIAL INTELLIGENCE, BIOINFORMATICS, SOFT COMPUTING, AND AMBIENT ASSISTED LIVING, PT II, PROCEEDINGS, 2009, 5518 : 1013 - 1019
  • [9] MICRAT: a novel algorithm for inferring gene regulatory networks using time series gene expression data
    Yang, Bei
    Xu, Yaohui
    Maxwell, Andrew
    Koh, Wonryull
    Gong, Ping
    Zhang, Chaoyang
    [J]. BMC SYSTEMS BIOLOGY, 2018, 12
  • [10] BiGGEsTS: Integrated environment for biclustering analysis of time series gene expression data
    Gonçalves J.P.
    Madeira S.C.
    Oliveira A.L.
    [J]. BMC Research Notes, 2 (1)