Transcriptome network component analysis with limited microarray data

Cited by: 43
Authors
Galbraith, Simon J.
Tran, Linh M.
Liao, James C. [1]
Affiliations
[1] Univ Calif Los Angeles, Dept Chem Engn, Los Angeles, CA 90024 USA
[2] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90024 USA
Funding
National Science Foundation (USA)
DOI
10.1093/bioinformatics/btl279
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Network component analysis (NCA) is a method to deduce transcription factor (TF) activities and TF-gene regulation control strengths from gene expression data and a TF-gene binding connectivity network. Previously, this method could analyze a maximum number of regulators equal to the total sample size because of the identifiability limit in data decomposition. As such, the total number of source signal components was limited to the total number of experiments rather than the total number of biological regulators. However, networks that have fewer transcriptome data points than regulators are of interest. Thus, it is imperative to develop a theoretical basis that allows realistic source signal extraction based on relatively few data points. On the other hand, such methods would inherently increase numerical challenges, leading to multiple solutions. Therefore, solutions to both problems are needed. Results: We have improved NCA for transcription factor activity (TFA) estimation, based on the observation that most genes are regulated by only a few TFs. This observation leads to the derivation of a new identifiability criterion, tested during numerical iteration, that allows us to decompose data when the number of TFs is greater than the number of experiments. To show that our method works with real microarray data and has biological utility, we analyze Saccharomyces cerevisiae cell cycle microarray data (73 experiments) using a TF-gene connectivity network (96 TFs) derived from ChIP-chip binding data. We compare the results of NCA analysis with the results obtained from ChIP-chip regression methods, and we show that NCA and regression produce TFAs that are qualitatively similar, but the NCA TFAs outperform regression in statistical tests. We also show that NCA can extract subtle TFA signals that correlate with known cell cycle TF function and cell cycle phase.
Overall, we determined that 31 TFs have statistically periodic TFAs in one or more experiments, 75% of which are known cell cycle regulators. In addition, we find that the 12 TFAs that are periodic in two or more experiments correspond to well-known cell cycle regulators. We also investigated TFA sensitivity to the choice of connectivity network: we constructed two networks using different ChIP-chip p-value cut-offs.
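The decomposition described in the abstract can be illustrated with a minimal sketch. It assumes the usual NCA model E = A P, where E is a genes-by-experiments expression matrix, A holds control strengths restricted to a known TF-gene connectivity pattern, and P holds the TF activities. The function name nca_als, the boolean mask Z, and the alternating least-squares scheme are illustrative assumptions; this is not the authors' algorithm and does not implement the new identifiability criterion for the case of more TFs than experiments.

```python
import numpy as np

def nca_als(E, Z, n_iter=200, seed=0):
    """Hypothetical NCA-style alternating least squares (illustration only).

    E : (genes x experiments) expression matrix.
    Z : (genes x TFs) boolean connectivity mask; A is forced to zero where Z is False.
    Returns A (genes x TFs) and P (TFs x experiments) with E approximately A @ P.
    """
    rng = np.random.default_rng(seed)
    n_genes, _ = E.shape
    n_tfs = Z.shape[1]
    # Random start that already respects the connectivity pattern.
    A = np.where(Z, rng.standard_normal((n_genes, n_tfs)), 0.0)
    P = np.zeros((n_tfs, E.shape[1]))
    for _ in range(n_iter):
        # Step 1: hold A fixed and solve A @ P = E for P in the least-squares sense.
        P = np.linalg.lstsq(A, E, rcond=None)[0]
        # Step 2: hold P fixed and refit each gene using only its connected TFs.
        for i in range(n_genes):
            idx = np.flatnonzero(Z[i])
            if idx.size == 0:
                continue  # gene with no known regulators keeps an all-zero row
            A[i, idx] = np.linalg.lstsq(P[idx].T, E[i], rcond=None)[0]
    return A, P
```

Any solution of this kind is determined only up to a scaling of each TF, since A P = (A D)(D^-1 P) for any invertible diagonal matrix D, so activities from such a sketch are comparable in shape rather than in absolute magnitude.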
Pages: 1886-1894
Page count: 9