ParaAntiProt provides paratope prediction using antibody and protein language models

Cited by: 0

Authors:

Kalemati, Mahmood [1]

Noroozi, Alireza [1]

Shahbakhsh, Aref [1]

Koohi, Somayyeh [1]

Affiliations:

[1] Sharif Univ Technol, Dept Comp Engn, Tehran, Iran

Source:

SCIENTIFIC REPORTS | 2024 / Vol. 14 / Issue 01

Keywords:

Paratope prediction; Antibody Language models; Protein Language models; Complementarity determining regions; Deep learning;

DOI:

10.1038/s41598-024-80940-y

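Chinese Library Classification (CLC):

O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];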

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

Efficiently predicting the paratope holds immense potential for enhancing antibody design, treating cancers and other serious diseases, and advancing personalized medicine. Although traditional methods are highly accurate, they are often time-consuming, labor-intensive, and reliant on 3D structures, which restricts their broader use. Machine learning-based methods, besides relying on structural data, also require descriptor computation, consideration of diverse physicochemical properties, and feature engineering. Here, we develop a deep learning-assisted method for paratope identification that relies solely on amino acid sequences and is antigen-agnostic. Building on the ProtTrans architecture and using pre-trained protein and antibody language models, we extract efficient embeddings for paratope prediction. By incorporating positional encoding for Complementarity Determining Regions (CDRs), our model gains a deeper structural understanding, achieving a ROC AUC of 0.904, an F1-score of 0.701, and an MCC of 0.585 on benchmark datasets. In addition to yielding accurate antibody paratope predictions, our method performs strongly in predicting nanobody paratopes, achieving a ROC AUC of 0.912 and a PR AUC of 0.665 on the nanobody dataset. Notably, our approach outperforms structure-based prediction methods, with a PR AUC of 0.731. Ablation studies, which examine the impact of each part of the model on the prediction task, show that the improvement in prediction performance from applying CDR positional encoding together with CNNs depends on the specific protein and antibody language models used. These results highlight the potential of our method to advance disease understanding and aid in the discovery of new diagnostics and antibody therapies.
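
The abstract reports F1-score and MCC for per-residue paratope prediction. As a reading aid, the following minimal sketch shows how these two metrics are conventionally computed from binary labels. The toy labels, the probabilities, the 0.5 decision threshold, and the function names are illustrative assumptions; they are not taken from the paper or its code.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary per-residue labels (1 = paratope residue)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def f1_score(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy data: one short sequence of 8 residues.
y_true = [0, 1, 1, 0, 0, 1, 0, 0]
probs = [0.1, 0.8, 0.4, 0.2, 0.6, 0.9, 0.3, 0.1]

# Assumed decision threshold of 0.5 (not stated in the abstract).
y_pred = [1 if p >= 0.5 else 0 for p in probs]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
f1 = f1_score(tp, fp, fn)
mcc_value = mcc(tp, fp, tn, fn)
print(f"F1 = {f1:.3f}, MCC = {mcc_value:.3f}")
```

Unlike ROC AUC and PR AUC, which are computed over all thresholds, F1 and MCC depend on the chosen decision threshold, so reported values are only comparable when the thresholding scheme is the same.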


Pages: 15
