BertSNR: an interpretable deep learning framework for single-nucleotide resolution identification of transcription factor binding sites based on DNA language model

Authors
Luo, Hanyu [1 ,2 ]
Tang, Li [1 ]
Zeng, Min [1 ]
Yin, Rui [3 ]
Ding, Pingjian [4 ]
Luo, Lingyun [2 ]
Li, Min [1 ]
Affiliations
[1] Cent South Univ, Sch Comp Sci & Engn, 932 South Lushan Rd, Changsha 410083, Hunan, Peoples R China
[2] Univ South China, Sch Comp Sci, 28 West Changsheng Rd, Hengyang 421001, Hunan, Peoples R China
[3] Univ Florida, Dept Hlth Outcome & Biomed Informat, Gainesville, FL 32611 USA
[4] Case Western Reserve Univ, Ctr Artificial Intelligence Drug Discovery, Sch Med, Cleveland, OH 44106 USA
Funding
National Natural Science Foundation of China
Keywords
CHIP-SEQ; DATABASE; OCT4;
DOI
10.1093/bioinformatics/btae461
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Transcription factors are pivotal in the regulation of gene expression, and accurate, high-resolution identification of transcription factor binding sites (TFBSs) is crucial for understanding the mechanisms underlying gene regulation. Identifying TFBSs from DNA sequences remains a significant challenge in computational biology. A variety of computational approaches have been developed to address it, but these methods are limited in their ability to achieve high-resolution identification and often lack interpretability.
Results: We propose BertSNR, an interpretable deep learning framework for identifying TFBSs at single-nucleotide resolution. BertSNR integrates sequence-level and token-level information through multi-task learning based on pre-trained DNA language models. Benchmarking comparisons show that BertSNR outperforms existing state-of-the-art methods in TFBS prediction. Importantly, we enhanced the interpretability of the model through attention weight visualization and motif analysis, and discovered a subtle relationship between attention weights and motifs. Moreover, BertSNR effectively identifies TFBSs in promoter regions, facilitating the study of intricate gene regulation.
Availability and implementation: The BertSNR source code can be found at https://github.com/lhy0322/BertSNR.
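The abstract describes combining sequence-level and token-level information through multi-task learning. As a rough illustration only, the sketch below shows one common way such an objective can be formed: a weighted sum of a sequence-level binary cross-entropy and a mean token-level binary cross-entropy. The function names, the `alpha` mixing weight, and the choice of binary cross-entropy are assumptions made for this example; the abstract does not specify the loss BertSNR actually uses.

```python
import math


def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one predicted probability p and a 0/1 label y."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def multitask_loss(seq_prob, seq_label, token_probs, token_labels, alpha=0.5):
    """Weighted sum of a sequence-level loss and a mean token-level loss.

    alpha is a hypothetical mixing weight, not a value taken from the paper.
    """
    if not token_probs or len(token_probs) != len(token_labels):
        raise ValueError("token_probs and token_labels must be non-empty and equal in length")
    # Sequence-level task: does this sequence contain a binding site?
    seq_loss = bce(seq_prob, seq_label)
    # Token-level task: is each token part of a binding site?
    tok_loss = sum(bce(p, y) for p, y in zip(token_probs, token_labels)) / len(token_probs)
    return alpha * seq_loss + (1.0 - alpha) * tok_loss
```

For instance, `multitask_loss(0.9, 1, [0.1, 0.8, 0.7], [0, 1, 1])` scores one sequence predicted as bound together with three per-token predictions; in a real model the probabilities would come from two output heads on a shared encoder.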
Pages: 10
Related papers
32 in total
  • [1] A deep learning model for predicting transcription factor binding location at Single Nucleotide Resolution
    Salekin, Sirajul
    Zhang, Jianqiu
    Huang, Yufei
    2017 IEEE EMBS INTERNATIONAL CONFERENCE ON BIOMEDICAL & HEALTH INFORMATICS (BHI), 2017, : 57 - 60
  • [2] MLSNet: a deep learning model for predicting transcription factor binding sites
    Zhang, Yuchuan
    Wang, Zhikang
    Ge, Fang
    Wang, Xiaoyu
    Zhang, Yiwen
    Li, Shanshan
    Guo, Yuming
    Song, Jiangning
    Yu, Dong-Jun
    BRIEFINGS IN BIOINFORMATICS, 2024, 25 (06)
  • [3] Prediction of Transcription Factor Binding Sites on Cell-Free DNA Based on Deep Learning
    Qi, Ting
    Zhou, Ying
    Sheng, Yuqi
    Li, Zhihui
    Yang, Yuwei
    Liu, Quanjun
    Ge, Qinyu
    JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2024, 64 (10) : 4002 - 4008
  • [4] Genome-wide identification of Bacillus subtilis CodY-binding sites at single-nucleotide resolution
    Belitsky, Boris R.
    Sonenshein, Abraham L.
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2013, 110 (17) : 7026 - 7031
  • [5] Prediction of Protein-DNA Binding Sites Based on Protein Language Model and Deep Learning
    Shan, Kaixuan
    Zhang, Xiankun
    Song, Chen
    ADVANCED INTELLIGENT COMPUTING IN BIOINFORMATICS, PT II, ICIC 2024, 2024, 14882 : 314 - 325
  • [6] Base-resolution prediction of transcription factor binding signals by a deep learning framework
    Zhang, Qinhu
    He, Ying
    Wang, Siguo
    Chen, Zhanheng
    Guo, Zhenhao
    Cui, Zhen
    Liu, Qi
    Huang, De-Shuang
    PLOS COMPUTATIONAL BIOLOGY, 2022, 18 (03)
  • [7] A Review About Transcription Factor Binding Sites Prediction Based on Deep Learning
    Zeng, Yuanqi
    Gong, Meiqin
    Lin, Meng
    Gao, Dongrui
    Zhang, Yongqing
    IEEE ACCESS, 2020, 8 : 219256 - 219274
  • [8] Single-Nucleotide Mutation Matrix: A New Model for Predicting the NF-κB DNA Binding Sites
    Du, Wenxin
    Gao, Jing
    Wang, Tingting
    Wang, Jinke
    PLOS ONE, 2014, 9 (07)
  • [9] FactorNet: A deep learning framework for predicting cell type specific transcription factor binding from nucleotide-resolution sequential data
    Quang, Daniel
    Xie, Xiaohui
    METHODS, 2019, 166 : 40 - 47
  • [10] Interpretable single-cell transcription factor prediction based on deep learning with attention mechanism
    Gong, Meiqin
    He, Yuchen
    Wang, Maocheng
    Zhang, Yongqing
    Ding, Chunli
    COMPUTATIONAL BIOLOGY AND CHEMISTRY, 2023, 106