Prediction of Transcription Factor Binding Sites Using a Combined Deep Learning Approach

Cited by: 5
Authors
Cao, Linan [1]
Liu, Pei [1]
Chen, Jialong [1]
Deng, Lei [1]
Affiliation
[1] Cent South Univ, Sch Comp Sci & Engn, Changsha, Peoples R China
Source
FRONTIERS IN ONCOLOGY | 2022 / Volume 12
Funding
National Natural Science Foundation of China;
Keywords
transcription factor binding sites; attention mechanism; positional embedding; deep learning; DNA; REPRESENTATION; SEQUENCES;
DOI
10.3389/fonc.2022.893520
Chinese Library Classification (CLC) number
R73 [Oncology];
Subject classification code
100214;
Abstract
In the regulation of gene expression and evolution, including processes such as DNA replication and mRNA transcription, the binding of transcription factors (TFs) to TF binding sites (TFBS) plays a vital role. Precisely modeling the specificity of genes and searching for TFBS helps to explore the mechanisms of cell expression. In recent years, computational and deep learning methods for finding TFBS have become an active field of research. However, existing methods generally cannot achieve high performance and interpretability simultaneously. Here, we develop an accurate and interpretable attention-based hybrid approach, DeepARC, that combines a convolutional neural network (CNN) and a recurrent neural network (RNN) to predict TFBS. DeepARC employs a positional embedding method to extract the hidden embedding from DNA sequences, combining the positional information from OneHot encoding with the distributed embedding from DNA2Vec. DeepARC feeds the positional embedding of the DNA sequence into a CNN-BiLSTM-Attention-based framework to find motifs. Taking advantage of the attention mechanism, DeepARC gains greater access to valuable information about the motif and brings interpretability to motif searching through the attention weight graph. Moreover, DeepARC achieves promising performance, with an average area under the receiver operating characteristic curve (AUC) of 0.908 on five cell lines (A549, GM12878, Hep-G2, H1-hESC, and HeLa) in the benchmark dataset. We also compare the positional embedding with OneHot and DNA2Vec used alone and find that it holds a competitive advantage.
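The abstract describes an input representation that pairs OneHot encoding with a distributed k-mer embedding from DNA2Vec. The following is a minimal sketch of that idea, not the authors' implementation: the function names, the choice of k = 3, the 8-dimensional vectors, the randomly initialized table standing in for pretrained DNA2Vec vectors, and the use of simple concatenation are all assumptions made for illustration, since the abstract does not specify how the two encodings are combined.

```python
import itertools
import random

BASES = "ACGT"

def one_hot(seq):
    """One 4-dimensional indicator vector per base; unknown bases map to all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def kmers(seq, k=3):
    """Split a sequence into overlapping k-mers with stride 1."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def toy_kmer_table(k=3, dim=8, seed=0):
    """Hypothetical stand-in for a pretrained DNA2Vec table: one random vector per k-mer."""
    rng = random.Random(seed)
    return {
        "".join(chars): [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        for chars in itertools.product(BASES, repeat=k)
    }

def combined_embedding(seq, table, k=3):
    """Concatenate the one-hot vector of each k-mer's first base with that k-mer's vector.

    Concatenation is an assumption of this sketch, not a detail taken from the paper.
    """
    dim = len(next(iter(table.values())))
    first_bases = one_hot(seq)
    rows = []
    for i, kmer in enumerate(kmers(seq, k)):
        # k-mers containing unknown bases (e.g. N) fall back to a zero vector.
        rows.append(first_bases[i] + table.get(kmer, [0.0] * dim))
    return rows

table = toy_kmer_table()
features = combined_embedding("ACGTAC", table)
print(len(features), len(features[0]))
```

Each row of `features` corresponds to one k-mer position; in a model like the one described, such a matrix would be the input to the convolutional layer, with real DNA2Vec vectors replacing the random table.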
Pages: 10