Prediction of protein-binding residues: dichotomy of sequence-based methods developed using structured complexes versus disordered proteins

Cited by: 18
Authors
Zhang, Jian [1]
Ghadermarzi, Sina [2]
Kurgan, Lukasz [2]
Affiliations
[1] Xinyang Normal Univ, Sch Comp & Informat Technol, Xinyang 464000, Peoples R China
[2] Virginia Commonwealth Univ, Dept Comp Sci, Richmond, VA 23284 USA
Funding
National Natural Science Foundation of China; U.S. National Science Foundation
Keywords
MOLECULAR RECOGNITION FEATURES; INTRINSIC DISORDER; INTERACTION SITES; COMPUTATIONAL PREDICTION; MORFS; IDENTIFICATION; REGIONS; RNA; DNA; SERVER;
DOI
10.1093/bioinformatics/btaa573
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: There are over 30 sequence-based predictors of protein-binding residues (PBRs). They use either structure-annotated or disorder-annotated training datasets, potentially creating a dichotomy where the structure- or disorder-specific models may not be able to cross over to accurately predict the other type. Moreover, the structure-trained predictors were shown to substantially cross-predict PBRs among residues that interact with non-protein partners (nucleic acids and small ligands). We address these issues by performing a first-of-its-kind comparative study of a representative collection of disorder- and structure-trained predictors using a comprehensive benchmark set with the structure- and disorder-derived annotations of PBRs (to analyze the cross-over) and the protein-, nucleic acid- and small ligand-binding proteins (to study the cross-predictions).
Results: Three predictors provide accurate results: SCRIBER, ANCHOR and disoRDPbind. Some of the structure-trained methods make accurate predictions on the structure-annotated proteins. Similarly, the disorder-trained predictors predict well on the disorder-annotated proteins. However, the considered predictors generally fail to cross over, with the exception of SCRIBER. Our study also reveals that virtually all methods substantially cross-predict PBRs, except for SCRIBER for the structure-annotated proteins and disoRDPbind for the disorder-annotated proteins. We formulate a novel hybrid predictor, hybridPBRpred, that combines results produced by disoRDPbind and SCRIBER to accurately predict disorder- and structure-annotated PBRs. HybridPBRpred generates accurate results that cross over between structure- and disorder-annotated proteins and produces a relatively low number of cross-predictions, offering an accurate alternative for predicting PBRs.
Pages: 4729-4738
Page count: 10