An efficient algorithm for improving structure-based prediction of transcription factor binding sites

Cited by: 14
Authors
Farrel, Alvin [1]
Guo, Jun-tao [1]
Affiliations
[1] Univ North Carolina Charlotte, Dept Bioinformat & Genom, 9201 Univ City Blvd, Charlotte, NC 28223 USA
Source
BMC Bioinformatics | 2017 / Vol. 18
Funding
National Institutes of Health (USA); National Science Foundation (USA)
Keywords
Transcription factor binding site; Structure-based prediction; Binding motif; Integrative energy function; Fragment-based method; Pentamer; PROTEIN-DNA INTERACTIONS; NUCLEIC-ACID STRUCTURES; CATION-PI INTERACTIONS; ENERGY FUNCTION; ALL-ATOM; SPECIFICITY; DOCKING; SHAPE; VISUALIZATION; ULTRABITHORAX;
DOI
10.1186/s12859-017-1755-0
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Gene expression is regulated by transcription factors binding to specific target DNA sites. Understanding how and where transcription factors bind at genome scale is an essential step toward understanding gene regulation networks. Previously, we developed a structure-based method for predicting transcription factor binding sites using an integrative energy function that combines a knowledge-based multibody potential and two atomic energy terms. While the method performs well, it is not computationally efficient, because the number of binding sequences to be evaluated grows exponentially with the length of the binding site. In this paper, we present an efficient pentamer algorithm that splits DNA binding sequences into overlapping fragments, along with a simplified integrative energy function for transcription factor binding site prediction.
Results: A DNA binding sequence is split into overlapping pentamers (5 base pairs) for calculating the transcription factor-pentamer interaction energy. To combine the scores from overlapping pentamers, we developed two methods, Kmer-Sum and PWM (Position Weight Matrix) stacking, for full-length binding motif prediction. Our results show that both Kmer-Sum and PWM stacking in the new pentamer approach, together with the simplified integrative energy function, improved transcription factor binding site prediction accuracy and dramatically reduced computation time, especially for longer binding sites.
Conclusion: Our new fragment-based pentamer algorithm and simplified energy function improve both efficiency and accuracy. To our knowledge, this is the first fragment-based method for structure-based transcription factor binding site prediction.
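The fragment idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: `toy_energy` is a hypothetical stand-in for the integrative energy function (a mismatch count against a made-up preferred site), and the function names, the `preferred` sequence, and the plain summation used for `kmer_sum` are assumptions made for illustration only. The sketch shows how a site is split into overlapping pentamers, how per-window pentamer scores might be summed for a full-length sequence, and why scoring fragments needs far fewer energy evaluations than enumerating every full-length sequence.

```python
from itertools import product

BASES = "ACGT"
K = 5  # pentamer length, as stated in the abstract


def split_into_pentamers(sequence, k=K):
    """Return (offset, fragment) pairs for every overlapping k-mer window."""
    return [(i, sequence[i:i + k]) for i in range(len(sequence) - k + 1)]


def toy_energy(offset, pentamer, preferred):
    """Hypothetical placeholder for the energy function (lower is better).

    Counts mismatches between the pentamer and the same window of a
    made-up preferred site; the real method scores a protein-DNA structure.
    """
    window = preferred[offset:offset + len(pentamer)]
    return sum(a != b for a, b in zip(pentamer, window))


def build_energy_table(site_length, preferred, k=K):
    """Score all 4**k fragments once for each window offset."""
    table = {}
    for offset in range(site_length - k + 1):
        for bases in product(BASES, repeat=k):
            pentamer = "".join(bases)
            table[(offset, pentamer)] = toy_energy(offset, pentamer, preferred)
    return table


def kmer_sum(sequence, table):
    """Assumed combination rule: sum the scores of the overlapping pentamers."""
    return sum(table[(offset, frag)]
               for offset, frag in split_into_pentamers(sequence))


def evaluation_counts(site_length, k=K):
    """Energy evaluations needed: full enumeration vs. fragment scoring."""
    full = 4 ** site_length
    fragment = (site_length - k + 1) * 4 ** k
    return full, fragment


preferred = "ACGTACGTAC"  # hypothetical 10-bp site
table = build_energy_table(len(preferred), preferred)
print(split_into_pentamers("ACGTACG"))
print(kmer_sum(preferred, table))
print(evaluation_counts(len(preferred)))
```

For a 10-bp site, exhaustive enumeration would score 4^10 = 1,048,576 sequences, whereas the fragment approach scores 6 windows of 4^5 = 1,024 pentamers each, or 6,144 evaluations. In this toy summation, a base in the middle of the site contributes to more windows than a base at either end; how the published Kmer-Sum and PWM stacking methods handle that overlap is not described in the abstract.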
Pages: 11