Interpretable CRISPR/Cas9 off-target activities with mismatches and indels prediction using BERT

Cited by: 2
Authors
Luo, Ye [1]
Chen, Yaowen [1]
Xie, HuanZeng [1]
Zhu, Wentao [1]
Zhang, Guishan [1]
Affiliations
[1] Shantou Univ, Coll Engn, Shantou 515063, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
CRISPR/Cas9; Off-target; BERT; Adaptive batch-wise class balancing; Deep learning; GENOME EDITING TECHNOLOGIES; CLASSIFICATION; CRISPR-CAS9; SPECIFICITY; DESIGN; CAS9; SYSTEMS; DNA
DOI
10.1016/j.compbiomed.2024.107932
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Off-target effects of CRISPR/Cas9 can lead to suboptimal genome editing outcomes. Numerous deep learning-based approaches have achieved excellent performance in off-target prediction; however, few can predict off-target activities with both mismatches and indels between a single guide RNA (sgRNA) and target DNA sequence pair. In addition, data imbalance is a common pitfall in off-target prediction. Moreover, due to the complexity of genomic contexts, building an interpretable model remains challenging. To address these issues, we first developed a BERT-based model called CRISPR-BERT to improve the prediction of off-target activities with both mismatches and indels. Second, we proposed an adaptive batch-wise class balancing strategy to combat the noise that exists in imbalanced off-target data. Finally, we applied a visualization approach to investigate the generalizable nucleotide position-dependent patterns of sgRNA-DNA pairs for off-target activity. In our comprehensive comparison with existing methods on five mismatches-only datasets and two mismatches-and-indels datasets, CRISPR-BERT achieved the best performance in terms of AUROC and PRAUC. In addition, the visualization analysis demonstrated how implicit knowledge learned by CRISPR-BERT facilitates off-target prediction, which shows potential for model interpretability. Collectively, CRISPR-BERT provides an accurate and interpretable framework for off-target prediction and further contributes to sgRNA optimization in practical use for improved target specificity in CRISPR/Cas9 genome editing. The source code is available at https://github.com/BrokenStringx/CRISPR-BERT
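The abstract reports results in terms of AUROC and PRAUC, two metrics commonly used when positives (true off-target sites) are rare. As a minimal sketch of how such metrics can be computed, and not the authors' evaluation code, the example below uses plain Python. The function names `auroc` and `average_precision` and the toy labels and scores are hypothetical; average precision is one common step-wise estimate of the area under the precision-recall curve, and the paper may use a different estimator.

```python
def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive is scored
    higher than a randomly chosen negative (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise PR-AUC estimate: the mean of the precision values measured
    at the rank of each positive. Assumes scores are distinct; with tied
    scores the result depends on how the ties happen to be ordered."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return precision_sum / hits


# Toy, made-up data: 2 positives among 8 candidate sites (imbalanced).
labels = [1, 0, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]

print("AUROC:", auroc(labels, scores))
print("Average precision:", average_precision(labels, scores))
```

On this toy input the AUROC equals 11/12 and the average precision equals 5/6. The gap between the two illustrates why a precision-recall metric is usually reported alongside AUROC on imbalanced data: a single negative ranked above a positive costs more precision than it costs in ROC terms.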
Pages: 15