A general-purpose protein design framework based on mining sequence-structure relationships in known protein structures

Cited by: 60
Authors
Zhou, Jianfu [1]
Panaitiu, Alexandra E. [1]
Grigoryan, Gevorg [1,2]
Affiliations
[1] Dartmouth Coll, Dept Comp Sci, Hanover, NH 03755 USA
[2] Dartmouth Coll, Dept Biol Sci, Hanover, NH 03755 USA
Keywords
protein design; data-driven protein design; structure-based analysis; protein structure; structure search; EFFECTIVE ENERGY FUNCTION; DE-NOVO DESIGN; COMPUTATIONAL DESIGN; ALGORITHM; SEARCH; PREDICTION; INTERFACE; FRAGMENTS; SOFTWARE; REDESIGN;
DOI
10.1073/pnas.1908723117
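Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09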
Abstract
Current state-of-the-art approaches to computational protein design (CPD) aim to capture the determinants of structure from physical principles. While this has led to many successful designs, it has strong limitations associated with inaccuracies in physical modeling, such that a reliable general solution to CPD has yet to be found. Here, we propose a design framework based on identifying and applying patterns of sequence-structure compatibility found in known proteins, rather than approximating them from models of interatomic interactions. We carry out extensive computational analyses and an experimental validation for our method. Our results strongly argue that the Protein Data Bank is now sufficiently large to enable proteins to be designed by using only examples of structural motifs from unrelated proteins. Because our method is likely to have orthogonal strengths relative to existing techniques, it could represent an important step toward removing remaining barriers to robust CPD.
Pages: 1059-1068
Number of pages: 10