Iterative signature algorithm for the analysis of large-scale gene expression data

Cited by: 269
Authors
Bergmann, S [1 ]
Ihmels, J [1 ]
Barkai, N [1 ]
Affiliations
[1] Weizmann Inst Sci, Dept Mol Genet, IL-76100 Rehovot, Israel
Source
PHYSICAL REVIEW E | 2003 / Vol. 67 / Issue 03
DOI
10.1103/PhysRevE.67.031902
Chinese Library Classification (CLC) number
O35 [Fluid mechanics]; O53 [Plasma physics];
Discipline classification code
070204 ; 080103 ; 080704 ;
Abstract
We present an approach for the analysis of genome-wide expression data. Our method is designed to overcome the limitations of traditional techniques when applied to large-scale data. Rather than allotting each gene to a single cluster, we assign both genes and conditions to context-dependent and potentially overlapping transcription modules. We provide a rigorous definition of a transcription module as the object to be retrieved from the expression data. An efficient algorithm, which searches for the modules encoded in the data by iteratively refining sets of genes and conditions until they match this definition, is established. Each iteration involves a linear map, induced by the normalized expression matrix, followed by the application of a threshold function. We argue that our method is in fact a generalization of singular value decomposition, which corresponds to the special case where no threshold is applied. We show analytically that for noisy expression data our approach leads to better classification due to the implementation of the threshold. This result is confirmed by numerical analyses based on in silico expression data. We briefly discuss results obtained by applying our algorithm to expression data from the yeast Saccharomyces cerevisiae.
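The iteration described in the abstract (a linear map induced by the normalized expression matrix, followed by a threshold, repeated until the gene and condition sets stop changing) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the binary set membership, the one-sided z-score threshold, and the toy matrix are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def zscore_rows(matrix):
    """Normalize each row to zero mean and unit variance (constant rows become zeros)."""
    out = []
    for row in matrix:
        m, s = mean(row), pstdev(row)
        out.append([(x - m) / s if s > 0 else 0.0 for x in row])
    return out


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def above_threshold(scores, t):
    """Indices whose score lies more than t standard deviations above the mean."""
    m, s = mean(scores), pstdev(scores)
    if s == 0:
        return set()
    return {i for i, x in enumerate(scores) if x - m > t * s}


def isa_sketch(expr, seed_genes, t_gene, t_cond, max_iter=50):
    """Refine a gene set and a condition set until both stop changing.

    expr is a genes x conditions matrix (list of rows). Assumption: members
    are treated as binary (in or out), not as weighted score vectors.
    """
    e_c = zscore_rows(expr)                        # each gene normalized over conditions
    e_g = transpose(zscore_rows(transpose(expr)))  # each condition normalized over genes
    n_genes, n_conds = len(expr), len(expr[0])
    genes, conds = set(seed_genes), set()
    for _ in range(max_iter):
        if not genes:
            break
        # Linear map: average the current genes' normalized expression per condition.
        cond_scores = [sum(e_g[g][c] for g in genes) / len(genes)
                       for c in range(n_conds)]
        new_conds = above_threshold(cond_scores, t_cond)
        if not new_conds:
            genes, conds = set(), set()
            break
        # Linear map back: average over the selected conditions per gene.
        gene_scores = [sum(e_c[g][c] for c in new_conds) / len(new_conds)
                       for g in range(n_genes)]
        new_genes = above_threshold(gene_scores, t_gene)
        if new_genes == genes and new_conds == conds:
            break  # fixed point reached
        genes, conds = new_genes, new_conds
    return sorted(genes), sorted(conds)


# Toy data (hypothetical): genes 0-2 are co-expressed under conditions 0-1.
expr = [
    [5, 5, 0, 0, 0],
    [5, 5, 0, 0, 0],
    [5, 5, 0, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1],
]
module_genes, module_conds = isa_sketch(expr, seed_genes=[0, 1], t_gene=0.5, t_cond=0.5)
```

Starting from an incomplete seed (genes 0 and 1), the sketch adds gene 2 and settles on conditions 0 and 1. With no threshold at all, the same two linear maps applied repeatedly would amount to a power iteration, which is how the singular value decomposition arises as the special case mentioned in the abstract.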
Pages: 18
Related papers
50 in total
  • [1] Analysis of large-scale gene expression data
    Sherlock, G
    [J]. CURRENT OPINION IN IMMUNOLOGY, 2000, 12 (02) : 201 - 205
  • [2] Performance Analysis of Gene Expression data using Biclustering Iterative Signature Algorithm
    Vengatesan, K.
    Singh, R. P.
    Bhaskar, Mahajan Sagar
    Padmanaban, Sanjeevikumar
    Ravishankar, T. Nadana
    Ramkumar, M.
    [J]. 2017 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING, INSTRUMENTATION AND CONTROL TECHNOLOGIES (ICICICT), 2017, : 7 - 11
  • [3] Large-Scale Analysis of Gene Expression Data Reveals a Novel Gene Expression Signature Associated with Colorectal Cancer Distant Recurrence
    Alajez, Nehad M.
    [J]. PLOS ONE, 2016, 11 (12):
  • [4] Challenges and prospects in the analysis of large-scale gene expression data
    Ihmels, JH
    Bergmann, S
    [J]. BRIEFINGS IN BIOINFORMATICS, 2004, 5 (04) : 313 - 327
  • [5] The HaLoop approach to large-scale iterative data analysis
    Bu, Yingyi
    Howe, Bill
    Balazinska, Magdalena
    Ernst, Michael D.
    [J]. VLDB JOURNAL, 2012, 21 (02): : 169 - 190
  • [6] The HaLoop approach to large-scale iterative data analysis
    Yingyi Bu
    Bill Howe
    Magdalena Balazinska
    Michael D. Ernst
    [J]. The VLDB Journal, 2012, 21 : 169 - 190
  • [7] Exploiting Scientific Workflows for Large-scale Gene Expression Data Analysis
    De Stasio, Alessandro
    Ertelt, Marcus
    Kemmner, Wolfgang
    Leser, Ulf
    Ceccarelli, Michele
    [J]. 2009 24TH INTERNATIONAL SYMPOSIUM ON COMPUTER AND INFORMATION SCIENCES, 2009, : 447 - +
  • [8] Interactive visualization of large-scale gene expression data
    Riveiro, Maria
    Lebram, Mikael
    Andersson, Christian X.
    Sartipy, Peter
    Synnergren, Jane
    [J]. PROCEEDINGS 2016 20TH INTERNATIONAL CONFERENCE INFORMATION VISUALISATION IV 2016, 2016, : 348 - 354
  • [9] SCANPY: large-scale single-cell gene expression data analysis
    F. Alexander Wolf
    Philipp Angerer
    Fabian J. Theis
    [J]. Genome Biology, 19
  • [10] SCANPY: large-scale single-cell gene expression data analysis
    Wolf, F. Alexander
    Angerer, Philipp
    Theis, Fabian J.
    [J]. GENOME BIOLOGY, 2018, 19