Predicting Transcription Factor Binding Sites with Deep Learning

Cited by: 0
Authors
Ghosh, Nimisha [1]
Santoni, Daniele [2]
Saha, Indrajit [3]
Felici, Giovanni [2]
Affiliations
[1] Siksha O Anusandhan Univ, Inst Tech Educ & Res, Dept Comp Sci & Informat Technol, Bhubaneswar 751030, India
[2] Natl Res Council Italy, Inst Syst Anal & Comp Sci Antonio Ruberti, I-00185 Rome, Italy
[3] Natl Inst Tech Teachers Training & Res, Dept Comp Sci & Engn, Kolkata 700106, India
Keywords
capsule network; deep learning; DNA sequences; transcription factor binding sites (TFBSs); DNA; proteins; identification; variants
DOI
10.3390/ijms25094990
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
Predicting the binding sites of transcription factors is important for understanding how these factors regulate gene expression and how this regulation can be modulated for therapeutic purposes. A considerable number of studies address this problem with different approaches, machine learning being among the most successful. Nevertheless, we note that many such approaches fail to propose a robust and meaningful method to embed the genetic data under analysis. We try to overcome this problem by proposing a bidirectional transformer-based encoder, augmented with bidirectional long short-term memory (BiLSTM) layers and with a capsule layer responsible for the final prediction. To evaluate the proposed approach, we use benchmark ChIP-seq datasets of five cell lines available in the ENCODE repository (A549, GM12878, Hep-G2, H1-hESC, and HeLa). The results show that the proposed method predicts transcription factor binding sites (TFBSs) very well within each of the five cell lines; moreover, cross-cell-line predictions also give satisfactory results. The cross-cell-line experiments are reinforced by the analysis of five additional cell lines, used only to test the model trained on the others. The results confirm that prediction performance across cell lines remains very high, allowing an extensive cross-transcription-factor analysis from which several indications of interest for molecular biology may be drawn.
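The abstract states only that a capsule layer is responsible for the final prediction and gives no further detail. As a rough illustration of how capsule outputs are commonly turned into class scores, the following minimal sketch applies the standard "squash" nonlinearity from the capsule-network literature and reads each capsule's length as a score. The function names, the two class labels, and the raw capsule vectors are hypothetical and are not taken from the paper.

```python
import math


def squash(vector, eps=1e-9):
    """Standard capsule squash: keep the direction, map the length into [0, 1)."""
    sq_norm = sum(x * x for x in vector)
    norm = math.sqrt(sq_norm)
    # Scale factor ||v||^2 / (1 + ||v||^2) divided by ||v||; eps avoids division by zero.
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in vector]


def capsule_length(vector):
    """Euclidean length of a capsule vector, read here as a class score."""
    return math.sqrt(sum(x * x for x in vector))


# Hypothetical raw outputs of two class capsules (assumed labels, made-up values).
raw_capsules = {"bound": [3.0, 4.0], "not_bound": [0.3, 0.4]}

# Squash each capsule and use its length as the score for that class.
scores = {label: capsule_length(squash(v)) for label, v in raw_capsules.items()}
predicted = max(scores, key=scores.get)
```

In this sketch a long raw vector is squashed to a length close to 1 and a short one to a length close to 0, so the class whose capsule is longest is taken as the prediction. Whether the published model uses exactly this scoring rule is an assumption.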
Pages: 14