TopModel: Template-Based Protein Structure Prediction at Low Sequence Identity Using Top-Down Consensus and Deep Neural Networks

Cited by: 35
Authors
Mulnaes, Daniel [1]
Porta, Nicola [1]
Clemens, Rebecca [6]
Apanasenko, Irina [2, 3, 7]
Reiners, Jens [6, 8]
Gremer, Lothar [2, 3, 7]
Neudecker, Philipp [2, 3, 7]
Smits, Sander H. J. [6, 8]
Gohlke, Holger [1, 2, 3, 4, 5]
Affiliations
[1] Heinrich Heine Univ Dusseldorf, Inst Pharmazeut & Med Chem, D-40225 Dusseldorf, Germany
[2] Forschungszentrum Julich, Inst Biol Informat Proc IBI 7 Struct Biochem, D-52425 Julich, Germany
[3] Forschungszentrum Julich, JuStruct, D-52425 Julich, Germany
[4] Forschungszentrum Julich, John von Neumann Inst Comp NIC, D-52425 Julich, Germany
[5] Forschungszentrum Julich, JSC, D-52425 Julich, Germany
[6] Heinrich Heine Univ Dusseldorf, Inst Biochem, D-40225 Dusseldorf, Germany
[7] Heinrich Heine Univ Dusseldorf, Inst Phys Biol, D-40225 Dusseldorf, Germany
[8] Heinrich Heine Univ Dusseldorf, Ctr Struct Studies, D-40225 Dusseldorf, Germany
Keywords
MOLECULAR-DYNAMICS SIMULATIONS; FOLD RECOGNITION; SECONDARY STRUCTURE; HOMOLOGY DETECTION; SERVER; ALIGNMENT; ALGORITHM; PROFILES; DESIGN; MODELS;
DOI
10.1021/acs.jctc.9b00825
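Chinese Library Classification (CLC) number
O64 [Physical chemistry (theoretical chemistry), chemical physics];
Discipline classification code
070304; 081704;
Abstract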
Knowledge of protein structures is essential to understand proteins' functions, evolution, dynamics, stabilities, and interactions and for data-driven protein or drug design. Yet the rate of experimental structure determination is far exceeded by that of next-generation sequencing, so that fewer than 1 in 1000 proteins have an experimentally known 3D structure. Computational structure prediction seeks to alleviate this problem, and the Critical Assessment of Protein Structure Prediction (CASP) has shown the value of consensus and meta-methods that utilize complementary algorithms. Traditionally, however, such methods employ majority voting during template selection and model averaging during refinement, which can drive the model away from the native fold if that fold is underrepresented in the ensemble. Here, we present TopModel, a fully automated meta-method for protein structure prediction. In contrast to traditional consensus and meta-methods, TopModel uses top-down consensus and deep neural networks to select templates and to identify and correct wrongly modeled regions. TopModel combines a broad range of state-of-the-art methods for threading, alignment, and model quality estimation and provides a versatile workflow and toolbox for template-based structure prediction. On the CASP10-12 datasets, TopModel shows superior template selection, alignment accuracy, and model quality for template-based structure prediction compared to 12 state-of-the-art stand-alone primary predictors. TopModel was validated by prospective predictions of the nisin resistance protein (NSR) from Streptococcus agalactiae and LipoP from Clostridium difficile, showing far better agreement with experimental data than any of its constituent primary predictors. These results demonstrate the general utility of TopModel for protein structure prediction and, in particular, show how combining computational structure prediction with sparse or low-resolution experimental data can improve the final model.
Pages: 1953-1967
Page count: 15