ParaAntiProt provides paratope prediction using antibody and protein language models

Authors
Kalemati, Mahmood [1]
Noroozi, Alireza [1]
Shahbakhsh, Aref [1]
Koohi, Somayyeh [1]
Affiliations
[1] Sharif Univ Technol, Dept Comp Engn, Tehran, Iran
Source
SCIENTIFIC REPORTS | 2024, Vol. 14, Issue 01
Keywords
Paratope prediction; Antibody Language models; Protein Language models; Complementarity determining regions; Deep learning;
DOI
10.1038/s41598-024-80940-y
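Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification codes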
07 ; 0710 ; 09 ;
Abstract
Efficient paratope prediction holds great potential for improving antibody design, treating cancers and other serious diseases, and advancing personalized medicine. Although traditional methods are highly accurate, they are often time-consuming, labor-intensive, and reliant on 3D structures, which restricts their broader use. Machine learning-based methods, in turn, not only rely on structural data but also require descriptor computation, consideration of diverse physicochemical properties, and feature engineering. Here, we develop a deep learning-assisted method for paratope identification that relies solely on amino acid sequences and is antigen-agnostic. Built on the ProtTrans architecture and using pre-trained protein and antibody language models, the method extracts embeddings for paratope prediction. By incorporating positional encoding for Complementarity Determining Regions (CDRs), the model gains a deeper structural understanding, achieving a ROC AUC of 0.904, an F1-score of 0.701, and an MCC of 0.585 on benchmark datasets. Beyond antibody paratopes, the method also performs strongly on nanobody paratope prediction, with a ROC AUC of 0.912 and a PR AUC of 0.665 on the nanobody dataset. Notably, our approach outperforms structure-based prediction methods, reaching a PR AUC of 0.731. Ablation studies, which examine the impact of each part of the model on the prediction task, show that the improvement gained from applying CDR positional encoding together with CNNs depends on the specific protein and antibody language models used. These results highlight the potential of our method to advance disease understanding and aid in the discovery of new diagnostics and antibody therapies.
Pages: 15