A random forest classifier for protein-protein docking models

Cited by: 5
Authors
Barradas-Bautista, Didier [1]
Cao, Zhen [1]
Vangone, Anna [2]
Oliva, Romina [3]
Cavallo, Luigi [1]
Gromiha, Michael
Affiliations
[1] King Abdullah Univ Sci & Technol KAUST, Kaust Catalysis Ctr, Phys Sci & Engn Div, Thuwal 239556900, Saudi Arabia
[2] Roche Innovat Ctr Munich Large Mol Res, Pharm Res & Early Dev, Therapeut Modal, D-82377 Penzberg, Germany
[3] Univ Parthenope Naples, Ctr Direzionale Isola C4, Dept Sci & Technol, I-80143 Naples, Italy
Source
BIOINFORMATICS ADVANCES | 2022 / Vol. 2 / Issue 01
Keywords
INTER-RESIDUE CONTACTS; PREDICTION; COMPLEXES; FEATURES; RANKING; ELECTROSTATICS; CONSERVATION; REFINEMENT; POTENTIALS; AFFINITY;
DOI
10.1093/bioadv/vbab042
Chinese Library Classification number
Q [Biological Sciences];
Subject classification code
07 ; 0710 ; 09 ;
Abstract
Herein, we present the results of a machine learning approach we developed to single out correct 3D docking models of protein-protein complexes obtained by popular docking software. To this end, we generated 3 × 10^4 docking models for each of the 230 complexes in the protein-protein benchmark, version 5, using three different docking programs (HADDOCK, FTDock and ZDOCK), for a cumulative set of approximately 7 × 10^6 docking models. Three different machine learning approaches (Random Forest, Support Vector Machine and Perceptron) were used to train classifiers with 158 different scoring functions (features). The Random Forest algorithm outperformed the other two algorithms and was selected for further optimization. Applying a feature selection algorithm and optimizing the random forest hyperparameters allowed us to train and validate a random forest classifier, named COnservation Driven Expert System (CoDES). Testing of CoDES on independent datasets, together with a comparison of its performance against machine learning methods recently developed in the field for the scoring of docking decoys, confirms its state-of-the-art ability to discriminate correct from incorrect decoys, both in terms of global parameters and in terms of decoys ranked at the top positions.
Supplementary information is available at Bioinformatics Advances online.
Software and data availability statement: The docking models are available at https://doi.org/10.5281/zenodo.4012018. The programs underlying this article will be shared on request to the corresponding authors.
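As an illustration only (not the authors' code, which is available on request), the short Python sketch below checks the dataset size quoted in the abstract and shows one common way to quantify how well a scorer places correct decoys "at the top positions". The function name `top_n_success_rate`, the definition of the metric (fraction of complexes with at least one correct decoy among the first n ranked models), and the toy rankings are all assumptions introduced here; the abstract does not specify which ranking metric was used.

```python
# Dataset size stated in the abstract: 3 x 10^4 models per complex, 230 complexes.
MODELS_PER_COMPLEX = 3 * 10**4
N_COMPLEXES = 230
total_models = MODELS_PER_COMPLEX * N_COMPLEXES  # 6,900,000, i.e. about 7 x 10^6


def top_n_success_rate(ranked_labels_per_complex, n):
    """Fraction of complexes with at least one correct decoy in the top n.

    ranked_labels_per_complex: list of lists of booleans, each inner list
    ordered from best-scored to worst-scored decoy (True = correct decoy).
    This metric definition is an assumption, not taken from the paper.
    """
    if not ranked_labels_per_complex:
        raise ValueError("need at least one complex")
    hits = sum(1 for labels in ranked_labels_per_complex if any(labels[:n]))
    return hits / len(ranked_labels_per_complex)


# Toy, made-up rankings for three hypothetical complexes (not data from the paper).
toy_rankings = [
    [False, True, False, False],   # first correct decoy at rank 2
    [False, False, False, True],   # first correct decoy at rank 4
    [False, False, False, False],  # no correct decoy at all
]

print(total_models)
for n in (1, 2, 4):
    print(n, top_n_success_rate(toy_rankings, n))
```

In a real evaluation, each inner list would hold the correctness labels of one complex's decoys sorted by the classifier's predicted score.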
Pages: 9
Related papers
50 in total
  • [31] Automatic prediction of flexible regions improves the accuracy of protein-protein docking models
    Luo, Xiaohu
    Lu, Qiang
    Wu, Hongjie
    Yang, Lingyun
    Huang, Xu
    Qian, Peide
    Fu, Gang
    JOURNAL OF MOLECULAR MODELING, 2012, 18 (05) : 2199 - 2208
  • [32] Introducing a Clustering Step in a Consensus Approach for the Scoring of Protein-Protein Docking Models
    Chermak, Edrisse
    De Donato, Renato
    Lensink, Marc F.
    Petta, Andrea
    Serra, Luigi
    Scarano, Vittorio
    Cavallo, Luigi
    Oliva, Romina
    PLOS ONE, 2016, 11 (11):
  • [33] Prediction of Protein-Protein Interaction Sites by Random Forest Algorithm with mRMR and IFS
    Li, Bi-Qing
    Feng, Kai-Yan
    Chen, Lei
    Huang, Tao
    Cai, Yu-Dong
    PLOS ONE, 2012, 7 (08):
  • [34] Random forest similarity for protein-protein interaction prediction from multiple sources
    Qi, YJ
    Klein-Seetharaman, J
    Bar-Joseph, Z
    PACIFIC SYMPOSIUM ON BIOCOMPUTING 2005, 2005, : 531 - 542
  • [35] Protein-protein interaction site prediction using random forest proximity distance
    Qiu, Zhijun
    Liu, Qingjie
    JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 2021, 19 (01)
  • [36] Protein-Protein Docking with Improved Shape Complementarity
    Yan, Yumeng
    Huang, Sheng-You
    INTELLIGENT COMPUTING THEORIES AND APPLICATION, PT I, 2018, 10954 : 600 - 605
  • [37] Protein-protein docking benchmark version 4.0
    Hwang, Howook
    Vreven, Thom
    Janin, Joel
    Weng, Zhiping
    PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2010, 78 (15) : 3111 - 3114
  • [38] Multi stage approach to protein-protein docking
    Kozakov, Dima
    Hall, David
    Beglov, Dmitriy
    Brenke, Ryan
    Vajda, Sandor
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2011, 242
  • [39] An Ensemble Classifier with Random Projection for Predicting Protein-Protein Interactions Using Sequence and Evolutionary Information
    Song, Xiao-Yu
    Chen, Zhan-Heng
    Sun, Xiang-Yang
    You, Zhu-Hong
    Li, Li-Ping
    Zhao, Yang
    APPLIED SCIENCES-BASEL, 2018, 8 (01):
  • [40] High-resolution protein-protein docking
    Gray, JJ
    CURRENT OPINION IN STRUCTURAL BIOLOGY, 2006, 16 (02) : 183 - 193