Protein-peptide binding residue prediction based on protein language models and cross-attention mechanism

Cited by: 4
Authors
Hu, Jun [1, 2]
Chen, Kai-Xin [1]
Rao, Bing [3]
Ni, Jing-Yuan [4]
Thafar, Maha A. [5]
Albaradei, Somayah [6]
Arif, Muhammad [7]
Affiliations
[1] Zhejiang Univ Technol, Coll Informat Engn, Hangzhou 310023, Peoples R China
[2] Suzhou Inst Syst Med, Ctr AI & Computat Biol, Suzhou 215123, Peoples R China
[3] Hangzhou City Univ, Sch Informat & Elect Engn, Hangzhou 310015, Peoples R China
[4] Nanjing Univ Informat Sci & Technol, NUIST Reading Acad, Nanjing 210044, Peoples R China
[5] Taif Univ, Coll Comp & Informat Technol, Dept Comp Sci, Taif 21944, Saudi Arabia
[6] King Abdulaziz Univ, Fac Comp & Informat Technol, Dept Comp Sci, Jeddah, Saudi Arabia
[7] Hamad Bin Khalifa Univ, Coll Sci & Engn, Doha 34110, Qatar
Funding
National Natural Science Foundation of China
Keywords
Peptide-binding residue prediction; Protein language model; Sequence-based methods; Cross-attention mechanism; SEQUENCE-BASED PREDICTION; WEB SERVER; REGIONS; SITES;
DOI
10.1016/j.ab.2024.115637
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Accurate identification of protein-peptide binding residues is essential for understanding protein-peptide interactions and advancing drug discovery. To address this problem, extensive research efforts have been made to design more discriminative feature representations. However, extracting these explicit features usually depends on third-party tools, which lowers computational efficiency and limits predictive performance. In this study, we design an end-to-end deep learning-based method, E2EPep, for protein-peptide binding residue prediction using the protein sequence only. E2EPep first employs and fine-tunes two state-of-the-art pre-trained protein language models that extract two distinct high-level latent feature representations from protein sequences, both relevant to protein structure and function. A novel feature fusion module is then designed in E2EPep to fuse and optimize these two feature representations of binding residues. In addition, we design E2EPep+, which integrates the E2EPep and PepBCL models, to improve prediction performance. Experimental results on two independent test data sets demonstrate that E2EPep and E2EPep+ achieve average AUC values of 0.846 and 0.842, respectively, while achieving an average Matthews correlation coefficient that is significantly higher than that of most existing sequence-based methods and comparable to that of state-of-the-art structure-based predictors. Detailed data analysis shows that the primary strength of E2EPep lies in the effectiveness of its feature representation, which uses a cross-attention mechanism to fuse the embeddings generated by the two fine-tuned protein language models. The standalone package of E2EPep and E2EPep+ can be obtained at https://github.com/ckx259/E2EPep.git for academic use only.
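The abstract names cross-attention as the way the two language-model embeddings are fused, but this record does not describe the fusion module itself. The following is therefore only a minimal sketch of generic scaled dot-product cross-attention in NumPy, not the E2EPep implementation. The function name `cross_attention`, the embedding sizes, and the random projection matrices are all hypothetical placeholders for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_attention(query_emb, context_emb, w_q, w_k, w_v):
    """Fuse two per-residue embeddings with scaled dot-product cross-attention.

    query_emb:   (L, d_a) residue embeddings from one language model
    context_emb: (L, d_b) residue embeddings from the other language model
    w_q, w_k, w_v: projections into a shared attention dimension d
    Returns the fused (L, d) representation and the (L, L) attention weights.
    """
    q = query_emb @ w_q        # queries come from the first embedding
    k = context_emb @ w_k      # keys come from the second embedding
    v = context_emb @ w_v      # values come from the second embedding
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v, weights


# Hypothetical sizes: a 6-residue protein with 8-dim and 12-dim embeddings.
rng = np.random.default_rng(0)
seq_len, dim_a, dim_b, dim_attn = 6, 8, 12, 4
emb_a = rng.normal(size=(seq_len, dim_a))
emb_b = rng.normal(size=(seq_len, dim_b))
w_q = rng.normal(size=(dim_a, dim_attn))
w_k = rng.normal(size=(dim_b, dim_attn))
w_v = rng.normal(size=(dim_b, dim_attn))

fused, weights = cross_attention(emb_a, emb_b, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

In a trained model the projection matrices would be learned and the fused per-residue vectors would feed a classifier that scores each residue as binding or non-binding; here the weights are random, so only the shapes and the row-normalized attention weights are meaningful.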
Pages: 11