Prediction of protein-binding residues: dichotomy of sequence-based methods developed using structured complexes versus disordered proteins

Cited by: 18
Authors
Zhang, Jian [1]
Ghadermarzi, Sina [2]
Kurgan, Lukasz [2]
Affiliations
[1] Xinyang Normal Univ, Sch Comp & Informat Technol, Xinyang 464000, Peoples R China
[2] Virginia Commonwealth Univ, Dept Comp Sci, Richmond, VA 23284 USA
Funding
National Natural Science Foundation of China; US National Science Foundation
Keywords
MOLECULAR RECOGNITION FEATURES; INTRINSIC DISORDER; INTERACTION SITES; COMPUTATIONAL PREDICTION; MORFS; IDENTIFICATION; REGIONS; RNA; DNA; SERVER;
DOI
10.1093/bioinformatics/btaa573
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Motivation: There are over 30 sequence-based predictors of protein-binding residues (PBRs). They use either structure-annotated or disorder-annotated training datasets, potentially creating a dichotomy in which the structure- and disorder-specific models cannot cross over to accurately predict the other type. Moreover, the structure-trained predictors were shown to substantially cross-predict PBRs among residues that interact with non-protein partners (nucleic acids and small ligands). We address these issues by performing a first-of-its-kind comparative study of a representative collection of disorder- and structure-trained predictors, using a comprehensive benchmark set that includes structure- and disorder-derived annotations of PBRs (to analyze the cross-over) and protein-, nucleic acid- and small ligand-binding proteins (to study the cross-predictions).
Results: Three predictors provide accurate results: SCRIBER, ANCHOR and disoRDPbind. Some of the structure-trained methods make accurate predictions on the structure-annotated proteins. Similarly, the disorder-trained predictors predict well on the disorder-annotated proteins. However, the considered predictors generally fail to cross over, with the exception of SCRIBER. Our study also reveals that virtually all methods substantially cross-predict PBRs, except for SCRIBER on the structure-annotated proteins and disoRDPbind on the disorder-annotated proteins. We formulate a novel hybrid predictor, hybridPBRpred, that combines the results produced by disoRDPbind and SCRIBER to accurately predict disorder- and structure-annotated PBRs. HybridPBRpred generates accurate results that cross over between structure- and disorder-annotated proteins and produces a relatively low number of cross-predictions, offering an accurate alternative for predicting PBRs.
Pages: 4729-4738
Number of pages: 10