Assessing the Accuracy and Efficiency of Free Energy Differences Obtained from Reweighted Flow-Based Probabilistic Generative Models

Cited by: 0
Authors
Olehnovics, Edgar [1,2]
Liu, Yifei Michelle [3]
Mehio, Nada [4]
Sheikh, Ahmad Y. [4]
Shirts, Michael R. [5]
Salvalaglio, Matteo [1,2]
Affiliations
[1] UCL, Thomas Young Ctr, London WC1E 7JE, England
[2] UCL, Dept Chem Engn, London WC1E 7JE, England
[3] AbbVie Biores Ctr, Mol Profiling & Drug Delivery Res & Dev, Worcester, MA 01605 USA
[4] AbbVie Inc, Mol Profiling & Drug Delivery Res & Dev, N Chicago, IL 60064 USA
[5] Univ Colorado, Boulder, CO 80309 USA
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
DOI
10.1021/acs.jctc.4c00520
Chinese Library Classification (CLC) number
O64 [Physical chemistry (theoretical chemistry), chemical physics];
Discipline classification code
070304; 081704;
Abstract
Computing free energy differences between metastable states characterized by nonoverlapping Boltzmann distributions is often a computationally intensive endeavor, usually requiring chains of intermediate states to connect them. Targeted free energy perturbation (TFEP) can significantly lower the computational cost of FEP calculations by choosing a set of invertible maps used to directly connect the distributions of interest, achieving the necessary statistically significant overlaps without sampling any intermediate states. Probabilistic generative models (PGMs) based on normalizing flow architectures can make it much easier via machine learning to train invertible maps needed for TFEP. However, the accuracy and applicability of approaches based on empirically learned maps depend crucially on the choice of reweighting method adopted to estimate the free energy differences. In this work, we assess the accuracy, rate of convergence, and data efficiency of different free energy estimators, including exponential averaging, Bennett acceptance ratio (BAR), and multistate Bennett acceptance ratio (MBAR), in reweighting PGMs trained by maximum likelihood on limited amounts of molecular dynamics data sampled only from end-states of interest. We carry out the comparisons on a set of simple but representative case studies, including conformational ensembles of alanine dipeptide and ibuprofen. Our results indicate that BAR and MBAR are both data efficient and robust, even in the presence of significant model overfitting in the generation of invertible maps. This analysis can serve as a stepping stone for the deployment of efficient and quantitatively accurate ML-based free energy calculation methods in complex systems.
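As a rough illustration of the estimators named in the abstract, the sketch below compares one-sided exponential averaging with BAR on synthetic data. It is not the authors' implementation. It assumes the usual TFEP convention in which each sample yields a generalized work value in units of kT: the potential energy change under the map minus the log-Jacobian of the map. The function names (`fermi`, `exp_estimate`, `bar_estimate`) are hypothetical, and the Gaussian work values used further down are synthetic stand-ins rather than data from the paper.

```python
import numpy as np

def fermi(x):
    """Numerically stable 1 / (1 + exp(x))."""
    return 0.5 * (1.0 - np.tanh(0.5 * x))

def exp_estimate(w_f):
    """One-sided exponential averaging: dF = -ln <exp(-w_F)> (units of kT)."""
    a = -np.asarray(w_f, dtype=float)
    shift = a.max()  # subtract the max before exponentiating to avoid overflow
    return -(shift + np.log(np.mean(np.exp(a - shift))))

def bar_estimate(w_f, w_r, tol=1e-10, max_iter=200):
    """Two-sided BAR estimate from forward and reverse work values (units of kT).

    Solves sum_i f(M + w_F,i - dF) = sum_j f(-M + w_R,j + dF) for dF by
    bisection, with f the Fermi function and M = ln(N_F / N_R).
    """
    w_f = np.asarray(w_f, dtype=float)
    w_r = np.asarray(w_r, dtype=float)
    m = np.log(len(w_f) / len(w_r))

    def g(df):
        # Monotonically increasing in df, so bisection is sufficient.
        return fermi(m + w_f - df).sum() - fermi(-m + w_r + df).sum()

    pad = 10.0 + abs(m)
    lo = min(w_f.min(), -w_r.max()) - pad  # g(lo) < 0 for this bracket
    hi = max(w_f.max(), -w_r.min()) + pad  # g(hi) > 0 for this bracket
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Synthetic check (assumption: Gaussian work distributions that satisfy the
# Crooks relation, so the exact answer is known in advance).
rng = np.random.default_rng(0)
true_df, sigma, n = 2.0, 1.0, 20000
w_forward = rng.normal(true_df + 0.5 * sigma**2, sigma, n)
w_reverse = rng.normal(-true_df + 0.5 * sigma**2, sigma, n)

print("EXP:", exp_estimate(w_forward))
print("BAR:", bar_estimate(w_forward, w_reverse))
```

Both estimators should land close to the reference value of 2 kT on this easy case. The difference the abstract refers to shows up when the forward and reverse work distributions overlap poorly: exponential averaging is then dominated by a few rare samples, while BAR uses the data from both end states.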
Pages: 5913-5922
Page count: 10