SPPPred: Sequence-Based Protein-Peptide Binding Residue Prediction Using Genetic Programming and Ensemble Learning

Cited by: 4
Authors
Shafiee, Shima [1]
Fathi, Abdolhossein [1]
Taherzadeh, Ghazaleh [2]
Affiliations
[1] Razi Univ, Dept Comp Engn & Informat Technol, Kermanshah 6714414971, Iran
[2] Wilkes Univ, Dept Math & Comp Sci, Wilkes Barre, PA 18766 USA
Keywords
Proteins; Feature extraction; Prediction algorithms; Classification algorithms; Support vector machines; Amino acids; Peptides; Binding residue prediction; ensemble learning; genetic programming; protein-peptide interaction; sequence-based; AMINO-ACID; SITES;
DOI
10.1109/TCBB.2022.3230540
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Peptide-binding proteins play significant roles in processes such as gene expression, metabolism, signal transmission, DNA (deoxyribonucleic acid) repair, and replication. Identifying the binding residues in protein-peptide complexes, especially from sequence alone, is challenging both experimentally and computationally. Although several computational approaches have been introduced to predict these binding residues, there is still ample room to improve prediction performance. In this work, we introduce a novel ensemble machine learning-based approach called SPPPred (Sequence-based Protein-Peptide binding residue Prediction) to predict protein-peptide binding residues. First, we extract relevant sequence information and employ a genetic programming algorithm for feature construction to find more distinctive features. We then build an ensemble-based machine learning classifier to predict binding residues. The proposed method shows consistent and comparable performance on both ten-fold cross-validation and an independent test set. On the independent test set, SPPPred yields an F-Measure (F-M), Accuracy (ACC), and Matthews' Correlation Coefficient (MCC) of 0.310, 0.949, and 0.230, respectively, outperforming competing methods by up to approximately 9%. SPPPred is publicly available at https://github.com/GTaherzadeh/SPPPred.git.
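The three metrics named in the abstract can be computed from confusion-matrix counts using their standard definitions. The following is a minimal sketch, not the authors' evaluation code: the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper. It shows how, when binding residues are rare, accuracy can be high while F-Measure and MCC stay much lower.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (f_measure, accuracy, mcc) from confusion-matrix counts.

    Standard textbook definitions; zero denominators fall back to 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return f_measure, accuracy, mcc


# Hypothetical counts for an imbalanced residue-level task
# (60 binding residues out of 1000); NOT data from the paper.
f_m, acc, mcc = binary_metrics(tp=30, fp=40, tn=900, fn=30)
print(f"F-M={f_m:.3f}  ACC={acc:.3f}  MCC={mcc:.3f}")
```

Because non-binding residues dominate, a classifier gets most of its accuracy from true negatives, which is why F-Measure and MCC are the more informative figures for this kind of task.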
Pages: 2029-2040
Page count: 12