Sequence determinants of protein phase separation and recognition by protein phase-separated condensates through molecular dynamics and active learning
Cited by: 2
Authors:
Changiarath, Arya [1]
Arya, Aayush [1]
Xenidis, Vasileios A. [2]
Padeken, Jan [3]
Stelzl, Lukas S. [3,4,5]
Affiliations:
[1] Johannes Gutenberg Univ JGU Mainz, Inst Phys, Mainz, Germany
[2] Aristotle Univ Thessaloniki, Dept Biol, Thessaloniki, Greece
[3] Inst Mol Biol IMB Mainz, Mainz, Germany
[4] Johannes Gutenberg Univ JGU Mainz, Inst Mol Physiol, Mainz, Germany
[5] Johannes Gutenberg Univ JGU Mainz, Inst Phys, KOMET1, Mainz, Germany
Keywords:
LANGUAGE;
DOI:
10.1039/d4fd00099d
Chinese Library Classification (CLC) number:
O64 [Physical chemistry (theoretical chemistry), chemical physics];
Discipline classification codes:
070304 ;
081704 ;
Abstract:
Elucidating how protein sequence determines the properties of disordered proteins and their phase-separated condensates is a great challenge in computational chemistry, biology, and biophysics. Quantitative molecular dynamics simulations and derived free energy values can in principle capture how a sequence encodes the chemical and biological properties of a protein. These calculations are, however, computationally demanding, even after reducing the representation by coarse-graining; exploring the large spaces of potentially relevant sequences remains a formidable task. We employ an "active learning" scheme introduced by Yang et al. (bioRxiv, 2022, https://doi.org/10.1101/2022.08.05.502972) to reduce the number of labelled examples needed from simulations, where a neural network-based model suggests the most useful examples for the next training cycle. Applying this Bayesian optimisation framework, we determine properties of protein sequences with coarse-grained molecular dynamics, which enables the network to establish sequence-property relationships for disordered proteins and their self-interactions and their interactions in phase-separated condensates. We show how iterative training with second virial coefficients derived from the simulations of disordered protein sequences leads to a rapid improvement in predicting peptide self-interactions. We employ this Bayesian approach to efficiently search for new sequences that bind to condensates of the disordered C-terminal domain (CTD) of RNA Polymerase II, by simulating molecular recognition of peptides to phase-separated condensates in coarse-grained molecular dynamics. By searching for protein sequences which prefer to self-interact rather than interact with another protein sequence we are able to shape the morphology of protein condensates and design multiphasic protein condensates.
Pages: 235-254
Page count: 20