Direct generation of protein conformational ensembles via machine learning

Cited by: 57
Authors
Janson, Giacomo [1]
Valdes-Garcia, Gilberto [1]
Heo, Lim [1]
Feig, Michael [1]
Affiliations
[1] Michigan State Univ, Dept Biochem & Mol Biol, E Lansing, MI 48824 USA
Funding
National Institutes of Health (USA)
Keywords
MOLECULAR-DYNAMICS; BIOMOLECULAR SIMULATION; DISORDERED PROTEINS; ENZYME;
DOI
10.1038/s41467-023-36443-x
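Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract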
Computational methods to study protein structural dynamics are powerful tools in the life sciences but are computationally expensive. Here, the authors show that machine learning can be used to efficiently generate protein conformational ensembles and test their method on intrinsically disordered peptides. Dynamics and conformational sampling are essential for linking protein structure to biological function. Because protein dynamics are challenging to probe experimentally, computer simulations are widely used to describe them, but at significant computational cost that continues to limit the systems that can be studied. Here, we demonstrate that machine learning models can be trained with simulation data to directly generate physically realistic conformational ensembles of proteins without the need for any sampling and at negligible computational cost. As a proof of principle, we train a generative adversarial network based on a transformer architecture with self-attention on coarse-grained simulations of intrinsically disordered peptides. The resulting model, idpGAN, can predict sequence-dependent coarse-grained ensembles for sequences that are not present in the training set, demonstrating that transferability can be achieved beyond the limited training data. We also retrain idpGAN on atomistic simulation data to show that the approach can be extended in principle to higher-resolution conformational ensemble generation.
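The idea of direct ensemble generation can be illustrated with a toy sketch: a generator takes an amino-acid sequence plus random latent vectors and returns one set of 3D coordinates per latent draw, so an ensemble is produced in a single forward pass rather than by simulation. Everything below is an assumption for illustration only and is not the idpGAN implementation: the class name ToyGenerator, the layer sizes, the per-residue latent input, and the single self-attention layer are hypothetical, the weights are random and untrained, and no discriminator or adversarial training is shown.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(sequence):
    """Encode a sequence as an (L, 20) one-hot matrix."""
    idx = [AMINO_ACIDS.index(aa) for aa in sequence]
    return np.eye(len(AMINO_ACIDS))[idx]


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class ToyGenerator:
    """Hypothetical, untrained stand-in for a sequence-conditioned generator."""

    def __init__(self, latent_dim=8, hidden_dim=16, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = len(AMINO_ACIDS) + latent_dim
        self.latent_dim = latent_dim
        # Random weights: a trained model would learn these adversarially.
        self.w_in = rng.normal(0.0, 0.3, (in_dim, hidden_dim))
        self.w_q = rng.normal(0.0, 0.3, (hidden_dim, hidden_dim))
        self.w_k = rng.normal(0.0, 0.3, (hidden_dim, hidden_dim))
        self.w_v = rng.normal(0.0, 0.3, (hidden_dim, hidden_dim))
        self.w_out = rng.normal(0.0, 0.3, (hidden_dim, 3))

    def generate(self, sequence, n_conformations, seed=1):
        """Return coordinates of shape (n_conformations, L, 3)."""
        rng = np.random.default_rng(seed)
        seq = one_hot(sequence)                                # (L, 20)
        length = seq.shape[0]
        # One latent vector per residue and per conformation (assumption).
        z = rng.normal(size=(n_conformations, length, self.latent_dim))
        seq_b = np.broadcast_to(seq, (n_conformations,) + seq.shape)
        h = np.concatenate([seq_b, z], axis=-1) @ self.w_in    # (N, L, H)
        # Single-head self-attention over residues with a residual connection.
        q, k, v = h @ self.w_q, h @ self.w_k, h @ self.w_v
        attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(k.shape[-1]))
        h = h + attn @ v
        return h @ self.w_out                                  # (N, L, 3)


def radius_of_gyration(coords):
    """Per-conformation radius of gyration for coords of shape (N, L, 3)."""
    centered = coords - coords.mean(axis=1, keepdims=True)
    return np.sqrt((centered ** 2).sum(axis=-1).mean(axis=-1))


generator = ToyGenerator()
ensemble = generator.generate("MKTAYIAKQR", n_conformations=100)
rg = radius_of_gyration(ensemble)
print(ensemble.shape, rg.shape)
```

Because the weights are random, the coordinates carry no physical meaning; the sketch only shows the data flow (sequence and noise in, many conformations out) and how an ensemble property such as the radius of gyration could then be computed per conformation.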
Pages: 14
Related papers
50 in total
  • [21] Generation of conformational ensembles of small molecules via surrogate model-assisted molecular dynamics
    Diez, Juan Viguera
    Atance, Sara Romeo
    Engkvist, Ola
    Olsson, Simon
    MACHINE LEARNING-SCIENCE AND TECHNOLOGY, 2024, 5 (02)
  • [22] Understanding Protein Dynamics Using Conformational Ensembles
    Salvatella, X.
    PROTEIN CONFORMATIONAL DYNAMICS, 2014, 805 : 67 - 85
  • [23] Data-Efficient Generation of Protein Conformational Ensembles with Backbone-to-Side-Chain Transformers
    Chennakesavalu, Shriram
    Rotskoff, Grant M.
    JOURNAL OF PHYSICAL CHEMISTRY B, 2024, 128 (09) : 2114 - 2123
  • [24] Unbound conformational ensembles improve protein-protein docking
    Fernandez-Recio, Juan
    Pallara, Chiara
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2012, 243
  • [25] Molecular machine learning with conformer ensembles
    Axelrod, Simon
    Gomez-Bombarelli, Rafael
    MACHINE LEARNING-SCIENCE AND TECHNOLOGY, 2023, 4 (03)
  • [26] Searching for Fairer Machine Learning Ensembles
    Feffer, Michael
    Hirzel, Martin
    Hoffman, Samuel C.
    Kate, Kiran
    Ram, Parikshit
    Shinnar, Avraham
    INTERNATIONAL CONFERENCE ON AUTOMATED MACHINE LEARNING, VOL 224, 2023, 224
  • [27] Procedural Content Generation via Machine Learning (PCGML)
    Summerville, Adam
    Snodgrass, Sam
    Guzdial, Matthew
    Holmgard, Christoffer
    Hoover, Amy K.
    Isaksen, Aaron
    Nealen, Andy
    Togelius, Julian
    IEEE TRANSACTIONS ON GAMES, 2018, 10 (03) : 257 - 270
  • [28] GENERATION OF DIAGNOSTIC RULES VIA INDUCTIVE MACHINE LEARNING
    CIOS, KJ
    LIU, N
    GOODENDAY, LS
    KYBERNETES, 1993, 22 (05) : 44 - 56
  • [29] Direct Correlation of Cell Toxicity to Conformational Ensembles of Genetic Aβ Variants
    Somavarapu, Arun Kumar
    Kepp, Kasper P.
    ACS CHEMICAL NEUROSCIENCE, 2015, 6 (12) : 1990 - 1996
  • [30] A Retrospective on the Development of Methods for the Analysis of Protein Conformational Ensembles
    Hayward, Steven
    PROTEIN JOURNAL, 2023, 42 (03) : 181 - 191