Protein structure prediction using deep learning distance and hydrogen-bonding restraints in CASP14

Cited by: 42
Authors
Zheng, Wei [1]
Li, Yang [1,2]
Zhang, Chengxin [1]
Zhou, Xiaogen [1]
Pearce, Robin [1]
Bell, Eric W. [1]
Huang, Xiaoqiang [1]
Zhang, Yang [1,3]
Affiliations
[1] Univ Michigan, Dept Computat Med & Bioinformat, Ann Arbor, MI 48109 USA
[2] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing, Peoples R China
[3] Univ Michigan, Dept Biol Chem, Ann Arbor, MI 48109 USA
Funding
U.S. National Science Foundation
Keywords
ab initio folding; CASP14; deep learning; domain partition; multiple sequence alignment; protein structure prediction; residue-residue distance prediction; FOLD-RECOGNITION; I-TASSER; SIMILARITY; SERVER;
DOI
10.1002/prot.26193
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
In this article, we report 3D structure prediction results by two of our best server groups ("Zhang-Server" and "QUARK") in CASP14. These two servers were built based on the D-I-TASSER and D-QUARK algorithms, which integrated four newly developed components into the classical protein folding pipelines, I-TASSER and QUARK, respectively. The new components include: (a) a new multiple sequence alignment (MSA) collection tool, DeepMSA2, which is extended from the DeepMSA program; (b) a contact-based domain boundary prediction algorithm, FUpred, to detect protein domain boundaries; (c) a residual convolutional neural network-based method, DeepPotential, to predict multiple spatial restraints by co-evolutionary features derived from the MSA; and (d) optimized spatial restraint energy potentials to guide the structure assembly simulations. For 37 FM targets, the average TM-scores of the first models produced by D-I-TASSER and D-QUARK were 96% and 112% higher than those constructed by I-TASSER and QUARK, respectively. The data analysis indicates noticeable improvements produced by each of the four new components, especially for the newly added spatial restraints from DeepPotential and the well-tuned force field that combines spatial restraints, threading templates, and generic knowledge-based potentials. However, challenges still exist in the current pipelines. These include difficulties in modeling multi-domain proteins due to low accuracy in inter-domain distance prediction and modeling protein domains from oligomer complexes, as the co-evolutionary analysis cannot distinguish inter-chain and intra-chain distances. Specifically tuning the deep learning-based predictors for multi-domain targets and protein complexes may be helpful to address these issues.
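The abstract compares methods by average TM-score and by relative improvement. The sketch below illustrates both quantities using the standard TM-score definition, in which each aligned residue pair at distance d contributes 1 / (1 + (d / d0)^2) and d0 = 1.24 * (L - 15)^(1/3) - 1.8 for a target of length L. It is a minimal illustration and not the authors' evaluation code: the function names are hypothetical, the distances are assumed to come from one fixed superposition (the full TM-score maximizes over superpositions), and the 0.5 Å floor on d0 for short chains is an assumption about common practice.

```python
def tm_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 (angstroms) of the TM-score."""
    # Assumption: floor d0 at 0.5 A, since the formula is not meaningful
    # for very short chains (L <= 15 would need a negative cube root).
    if target_length <= 15:
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score_fixed_superposition(distances, target_length: int) -> float:
    """TM-score term sum for ONE given superposition.

    `distances` holds the model-to-native distance (angstroms) of each
    aligned residue pair. Unaligned residues contribute nothing, but the
    sum is still normalized by the full target length.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


def relative_gain_percent(new_score: float, old_score: float) -> float:
    """Relative improvement of new_score over old_score, in percent."""
    return 100.0 * (new_score - old_score) / old_score


if __name__ == "__main__":
    # Hypothetical distances for a 100-residue target, for illustration only.
    example_distances = [1.0] * 60 + [4.0] * 30  # 10 residues left unaligned
    print(tm_score_fixed_superposition(example_distances, 100))
    # Hypothetical scores: a doubling corresponds to a 100% relative gain.
    print(relative_gain_percent(0.6, 0.3))
```

Read this way, an average TM-score that is 96% higher means the new average is close to double the baseline average.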
Pages: 1734-1751
Number of pages: 18