Efficient algorithms for protein sequence design and the analysis of certain evolutionary fitness landscapes

Cited by: 5
Author
Kleinberg, JM [1]
Affiliation
[1] Cornell Univ, Dept Comp Sci, Ithaca, NY 14853 USA
Keywords
inverse protein folding; protein sequence design; network flow algorithms; combinatorial optimization; evolutionary fitness landscapes;
DOI
10.1089/106652799318346
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Protein sequence design is a natural inverse problem to protein structure prediction: given a target structure in three dimensions, we wish to design an amino acid sequence that is likely to fold to it. A model of Sun, Brem, Chan, and Dill casts this problem as an optimization on a space of sequences of hydrophobic (H) and polar (P) monomers; the goal is to find a sequence that achieves a dense hydrophobic core with few solvent-exposed hydrophobic residues. Sun et al. developed a heuristic method to search the space of sequences, without a guarantee of optimality or near-optimality; Hart subsequently raised the computational tractability of constructing an optimal sequence in this model as an open question. Here we resolve this question by providing an efficient algorithm to construct optimal sequences; our algorithm has a polynomial running time, and performs very efficiently in practice. We illustrate the implementation of our method on structures drawn from the Protein Data Bank. We also consider extensions of the model to larger amino acid alphabets, as a way to overcome the limitations of the binary H/P alphabet. We show that for a natural class of arbitrarily large alphabets, it remains possible to design optimal sequences efficiently. Finally, we analyze some of the consequences of this sequence design model for the study of evolutionary fitness landscapes. A given target structure may have many sequences that are optimal in the model of Sun et al.; following a notion raised by the work of J. Maynard Smith, we can ask whether these optimal sequences are "connected" by successive point mutations. We provide a polynomial-time algorithm to decide this connectedness property, relative to a given target structure. We develop the algorithm by first solving an analogous problem expressed in terms of submodular functions, a fundamental object of study in combinatorial optimization.
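Illustrative sketch (not taken from the paper): the abstract and keywords point to network flow as the tool for finding optimal H/P sequences. The toy below assumes the objective can be written as a nonnegative exposure penalty for every residue labeled H, minus a nonnegative reward for every contacting pair in which both residues are H. Under that assumption, minimizing the objective is a standard minimum-cut ("project selection") problem. The function names `design_hp`, `max_flow`, and `brute_force`, and all numeric values, are hypothetical and chosen only for demonstration; the paper's actual construction and weights may differ.

```python
from collections import deque
from itertools import combinations

INF = float("inf")

def max_flow(cap, s, t):
    """Edmonds-Karp on a dict-of-dicts residual graph (mutates cap).
    Returns (flow value, set of nodes on the source side of a minimum cut)."""
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow, set(parent)
        # Bottleneck capacity along the augmenting path found by BFS.
        bottleneck, v = INF, t
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck

def design_hp(exposure, contacts):
    """exposure[i] >= 0: assumed penalty if residue i is H.
    contacts[(i, j)] >= 0: assumed reward if residues i and j are both H.
    Returns (sequence over {H, P}, minimum energy)."""
    cap = {"s": {}, "t": {}}

    def add_edge(u, v, c):
        cap.setdefault(u, {}).setdefault(v, 0)
        cap.setdefault(v, {}).setdefault(u, 0)
        cap[u][v] += c

    for i, b in enumerate(exposure):
        add_edge(("r", i), "t", b)          # pay b if residue i ends up H
    for (i, j), w in contacts.items():
        pair = ("p", i, j)
        add_edge("s", pair, w)              # forgo w unless both ends are H
        add_edge(pair, ("r", i), INF)
        add_edge(pair, ("r", j), INF)

    flow, source_side = max_flow(cap, "s", "t")
    seq = "".join("H" if ("r", i) in source_side else "P"
                  for i in range(len(exposure)))
    return seq, flow - sum(contacts.values())

def brute_force(exposure, contacts):
    """Exhaustive minimum over all H subsets, for checking small cases."""
    n = len(exposure)
    best = 0  # the all-P sequence has energy 0
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            chosen = set(subset)
            e = sum(exposure[i] for i in chosen)
            e -= sum(w for (i, j), w in contacts.items()
                     if i in chosen and j in chosen)
            best = min(best, e)
    return best

exposure = [1, 1, 5, 1, 4]                  # made-up penalties
contacts = {(0, 1): 2, (1, 3): 2, (0, 3): 1, (2, 4): 3, (3, 4): 1}
print(design_hp(exposure, contacts))        # -> ('HHPHP', -2)
```

On this five-residue toy the cut places residues 0, 1, and 3 in the hydrophobic set, and the exhaustive check returns the same energy. The point of the reduction is that the min-cut computation runs in polynomial time, whereas the exhaustive search grows as 2^n.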
Pages: 387-404
Page count: 18
Related papers
50 in total
  • [1] Fitness landscapes and evolutionary algorithms
    Reeves, CR
    [J]. ARTIFICIAL EVOLUTION, 2000, 1829 : 3 - 20
  • [2] Co-Evolutionary Fitness Landscapes for Sequence Design
    Tian, Pengfei
    Louis, John M.
    Baber, James L.
    Aniana, Annie
    Best, Robert B.
    [J]. ANGEWANDTE CHEMIE-INTERNATIONAL EDITION, 2018, 57 (20) : 5674 - 5678
  • [3] Evolutionary Algorithms with Clustering for Dynamic Fitness Landscapes
    Aragon, Victoria
    Esquivel, Susana
    [J]. JOURNAL OF COMPUTER SCIENCE & TECHNOLOGY, 2005, 5 (04) : 196 - 203
  • [4] Behavior of evolutionary algorithms in chaotically changing fitness landscapes
    Richter, H
    [J]. PARALLEL PROBLEM SOLVING FROM NATURE - PPSN VIII, 2004, 3242 : 111 - 120
  • [5] Comparison of fitness landscapes for evolutionary design of dipole antennas
    Alander, JT
    Zinchenko, LA
    Sorokin, SN
    [J]. IEEE TRANSACTIONS ON ANTENNAS AND PROPAGATION, 2004, 52 (11) : 2932 - 2940
  • [6] Evolutionary mechanisms studied through protein fitness landscapes
    Canale, Aneth S.
    Cote-Hammarlof, Pamela A.
    Flynn, Julia M.
    Bolon, Daniel N. A.
    [J]. CURRENT OPINION IN STRUCTURAL BIOLOGY, 2018, 48 : 141 - 148
  • [7] Fitness landscapes analysis and adaptive algorithms design for traffic lights optimization on SIALAC benchmark
    Lepretre, Florian
    Fonlupt, Cyril
    Verel, Sebastien
    Marion, Virginie
    Armas, Rolando
    Aguirre, Hernan
    Tanaka, Kiyoshi
    [J]. APPLIED SOFT COMPUTING, 2019, 85
  • [8] On the performance of evolutionary algorithms with life-time adaptation in dynamic fitness landscapes
    Eriksson, R
    Olsson, B
    [J]. CEC2004: PROCEEDINGS OF THE 2004 CONGRESS ON EVOLUTIONARY COMPUTATION, VOLS 1 AND 2, 2004, : 1293 - 1300
  • [9] Direct Calculation of Protein Fitness Landscapes through Computational Protein Design
    Au, Loretta
    Green, David F.
    [J]. BIOPHYSICAL JOURNAL, 2016, 110 (01) : 75 - 84
  • [10] Efficient search for robust solutions by means of evolutionary algorithms and fitness approximation
    Paenke, Ingo
    Branke, Juergen
    Jin, Yaochu
    [J]. IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, 2006, 10 (04) : 405 - 420