An efficient method to transcription factor binding sites imputation via simultaneous completion of multiple matrices with positional consistency

Cited by: 17
Authors
Guo, Wei-Li [1 ]
Huang, De-Shuang [1 ]
Affiliations
[1] Tongji Univ, Inst Machine Learning & Syst Biol, Sch Elect & Informat Engn, Shanghai 201804, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
CHIP-SEQ; DNA-BINDING; ENCODE; DISCOVERY; NETWORKS; MOTIFS;
DOI
10.1039/c7mb00155j
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification codes
071010 ; 081704 ;
Abstract
Transcription factors (TFs) are DNA-binding proteins that have a central role in regulating gene expression. Identification of DNA-binding sites of TFs is a key task in understanding transcriptional regulation, cellular processes and disease. Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) enables genome-wide identification of in vivo TF binding sites. However, it is still difficult to map every TF in every cell line owing to cost and biological material availability, which poses an enormous obstacle for integrated analysis of gene regulation. To address this problem, we propose a novel computational approach, TFBSImpute, for predicting additional TF binding profiles by leveraging information from available ChIP-seq TF binding data. TFBSImpute fuses the dataset into a 3-mode tensor and imputes missing TF binding signals via simultaneous completion of multiple TF binding matrices with positional consistency. By assessing imputation methods against observed ChIP-seq TF binding profiles, we show that signals predicted by our method achieve overall similarity with experimental data and that TFBSImpute significantly outperforms baseline approaches. In addition, motif analysis shows that TFBSImpute performs better than the baselines in capturing binding motifs enriched in observed data, indicating that the higher performance of TFBSImpute is not simply due to averaging related samples. We anticipate that our approach will constitute a useful complement to experimental mapping of TF binding, which is beneficial for further study of regulation mechanisms and disease.
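The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of filling a missing (TF, cell line) track in a 3-mode tensor of shape TF x cell line x genomic position. It is not TFBSImpute: it swaps in a plain iterative truncated-SVD completion on the TF-mode unfolding, and the function name `impute_missing_tracks`, the `rank` and `n_iter` parameters, and the toy rank-1 data are all assumptions made for illustration.

```python
import numpy as np


def impute_missing_tracks(tensor, observed, rank=1, n_iter=100):
    """Fill unobserved (TF, cell line) tracks with a low-rank estimate.

    tensor:   array of shape (n_tf, n_cell, n_pos); values at unobserved
              tracks are ignored (they may be NaN).
    observed: boolean array of shape (n_tf, n_cell); True where a
              ChIP-seq track exists.
    Hypothetical stand-in for tensor completion, not the published method.
    """
    n_tf, n_cell, n_pos = tensor.shape
    # The same missing pattern applies at every genomic position.
    mask = np.repeat(observed[:, :, None], n_pos, axis=2)
    filled = np.where(mask, tensor, 0.0)  # start missing entries at zero
    for _ in range(n_iter):
        # Unfold along the TF mode: rows are TFs, columns are (cell, position).
        unfolded = filled.reshape(n_tf, n_cell * n_pos)
        u, s, vt = np.linalg.svd(unfolded, full_matrices=False)
        low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
        # Keep observed values, overwrite only the missing ones.
        filled = np.where(mask, tensor, low_rank.reshape(n_tf, n_cell, n_pos))
    return filled


# Toy data (assumed): signal = TF effect * cell-line effect * positional profile.
tf_factor = np.array([1.0, 2.0, 3.0, 4.0])
cell_factor = np.array([1.0, 2.0, 3.0])
pos_profile = np.abs(np.sin(np.linspace(0.0, 3.0, 50))) + 0.1
truth = np.einsum("i,j,k->ijk", tf_factor, cell_factor, pos_profile)

observed = np.ones((4, 3), dtype=bool)
observed[0, 0] = False          # pretend TF 0 was never assayed in cell line 0
data = truth.copy()
data[0, 0, :] = np.nan

imputed = impute_missing_tracks(data, observed)
max_error = np.max(np.abs(imputed[0, 0] - truth[0, 0]))
print("max abs error on the held-out track:", max_error)
```

Because the toy tensor is exactly rank 1, the held-out track is recoverable from the remaining ones; real ChIP-seq signals are noisy and higher rank, so a fixed small `rank` here would only be a rough baseline.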
Pages: 1827-1837
Number of pages: 11