Herein, we present the results of a machine learning approach we developed to single out correct 3D docking models of protein-protein complexes obtained by popular docking software. To this aim, we generated 3 × 10^4 docking models for each of the 230 complexes in the protein-protein benchmark, version 5, using three different docking programs (HADDOCK, FTDock and ZDOCK), for a cumulative set of approximately 7 × 10^6 docking models. Three different machine learning approaches (Random Forest, Support Vector Machine and Perceptron) were used to train classifiers with 158 different scoring functions (features). The Random Forest algorithm outperformed the other two and was selected for further optimization. Using a feature selection algorithm and optimizing the Random Forest hyperparameters allowed us to train and validate a Random Forest classifier, named COnservation Driven Expert System (CoDES). Testing of CoDES on independent datasets, together with a comparison against machine learning methods recently developed for scoring docking decoys, confirms its state-of-the-art ability to discriminate correct from incorrect decoys, both in terms of global parameters and in terms of decoys ranked at the top positions.

Supplementary information is available at Bioinformatics Advances online.

Software and data availability statement: The docking models are available at https://doi.org/10.5281/zenodo.4012018. The programs underlying this article will be shared on request to the corresponding authors.
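
The abstract evaluates classifiers partly by "decoys ranked at the top positions". One common way to quantify this in docking studies is a top-N success rate: the fraction of complexes for which at least one correct decoy appears among the N best-scored models. The sketch below illustrates that idea only; the function name `top_n_success_rate`, the `(score, is_correct)` data layout, and the toy values are assumptions for illustration, not the metric definition or data used for CoDES.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: each decoy is (classifier score, is_correct label).
Decoy = Tuple[float, bool]


def top_n_success_rate(decoys_by_complex: Dict[str, List[Decoy]], n: int) -> float:
    """Fraction of complexes with at least one correct decoy among the n best-scored."""
    if not decoys_by_complex:
        raise ValueError("no complexes given")
    hits = 0
    for decoys in decoys_by_complex.values():
        # Higher score is assumed to mean "more likely correct".
        ranked = sorted(decoys, key=lambda d: d[0], reverse=True)
        if any(is_correct for _, is_correct in ranked[:n]):
            hits += 1
    return hits / len(decoys_by_complex)


# Toy, made-up scores purely to exercise the function.
toy = {
    "complex_A": [(0.9, True), (0.8, False), (0.1, False)],
    "complex_B": [(0.7, False), (0.6, True), (0.2, False)],
    "complex_C": [(0.5, False), (0.4, False), (0.3, False)],
}

print(top_n_success_rate(toy, 1))  # only complex_A has a correct decoy ranked first
print(top_n_success_rate(toy, 2))  # complex_A and complex_B succeed within the top 2
```

Grouping by complex before ranking matters here: pooling all decoys into a single ranking would let complexes with many high-scoring models dominate the result.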