Principal component analysis for predicting transcription-factor binding motifs from array-derived data

Cited by: 8
Authors
Liu, YL
Vincenti, MP
Yokota, H [1]
Affiliations
[1] Indiana Univ Purdue Univ, Dept Biomed Engn, Indianapolis, IN 46202 USA
[2] Purdue Univ, Weldon Sch Biomed Engn, W Lafayette, IN 47907 USA
[3] Indiana Univ Purdue Univ, Dept Anat & Cell Biol, Indianapolis, IN 46202 USA
[4] Dept Vet Affairs, White River Jct, VT 05009 USA
[5] Dartmouth Coll Sch Med, Dept Med, Hanover, NH 03755 USA
DOI
10.1186/1471-2105-6-276
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: The responses to interleukin 1 (IL-1) in human chondrocytes constitute a complex regulatory mechanism in which multiple transcription factors interact combinatorially with transcription-factor binding motifs (TFBMs). Selecting a critical set of TFBMs from genomic DNA information and array-derived data requires an efficient algorithm for solving a combinatorial optimization problem. Although computational approaches based on evolutionary algorithms are commonly employed, an analytical algorithm would be useful for predicting TFBMs at almost no computational cost and for evaluating varying modelling conditions. Singular value decomposition (SVD) is a powerful method for deriving the principal components of a given matrix. By applying SVD to a promoter matrix defined from regulatory DNA sequences, we derived a novel method to predict the critical set of TFBMs.

Results: The promoter matrix was defined to establish a quantitative relationship between the IL-1-driven mRNA alteration and the genomic DNA sequences of the IL-1 responsive genes. The matrix was decomposed with SVD, and the effects of 8 potential TFBMs (5'-CAGGC-3', 5'-CGCCC-3', 5'-CCGCC-3', 5'-ATGGG-3', 5'-GGGAA-3', 5'-CGTCC-3', 5'-AAAGG-3', and 5'-ACCCA-3') were predicted from a pool of 512 random DNA sequences. The prediction included matches to the core binding motifs of biologically known TFBMs such as AP2, SP1, EGR1, KROX, GC-BOX, ABI4, ETF, E2F, SRF, STAT, IK-1, PPAR, STAF, ROAZ, and NF-κB, and their significance was evaluated numerically using Monte Carlo simulation and a genetic algorithm.

Conclusion: The described SVD-based prediction is an analytical method that provides a set of potential TFBMs involved in transcriptional regulation. The results should be useful for analytically evaluating the contribution of individual DNA sequences.
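
The abstract does not spell out how the promoter matrix is built or how the SVD is used, so the following Python sketch is only one plausible reading, not the authors' implementation. It assumes that each row of the promoter matrix counts occurrences of candidate 5-bp motifs in one gene's promoter sequence, and that motif effects are estimated by solving a linear system against the observed mRNA changes through an SVD-based pseudo-inverse. All function names (`all_kmers`, `promoter_matrix`, `svd_motif_effects`) and the toy sequences are hypothetical.

```python
import itertools

import numpy as np


def all_kmers(k=5):
    """Enumerate every DNA word of length k (4**k candidates)."""
    return ["".join(p) for p in itertools.product("ACGT", repeat=k)]


def count_occurrences(seq, motif):
    """Count (possibly overlapping) occurrences of motif in seq."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def promoter_matrix(promoters, motifs):
    """Assumed form: rows are genes, columns are motif counts per promoter."""
    return np.array(
        [[count_occurrences(seq, m) for m in motifs] for seq in promoters],
        dtype=float,
    )


def svd_motif_effects(H, z, rank=None, tol=1e-10):
    """Least-squares motif effects x for H @ x ~ z via a truncated SVD."""
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    keep = s > tol * s.max()      # discard numerically zero singular values
    if rank is not None:
        keep[rank:] = False       # optionally keep only the leading components
    coeffs = (U[:, keep].T @ z) / s[keep]
    return Vt[keep].T @ coeffs


if __name__ == "__main__":
    # Toy data only: invented promoters and invented expression changes.
    motifs = ["CAGGC", "GGGAA", "AAAGG"]
    promoters = ["CAGGCTTTT", "GGGAATTTT", "AAAGGTTTT", "CAGGCTGGGAA"]
    H = promoter_matrix(promoters, motifs)
    z = np.array([2.0, -1.0, 0.5, 1.0])   # hypothetical log-ratios of mRNA change
    print(dict(zip(motifs, svd_motif_effects(H, z))))
```

Because the solution is obtained in closed form from a single decomposition, re-running it under different modelling assumptions (motif length, promoter length, number of retained components) is cheap, which is the advantage the abstract claims over evolutionary search.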
Pages: 12