Gaussian Process-Based Refinement of Dispersion Corrections

Cited by: 40
Authors
Proppe, Jonny [1, 2, 3]
Gugler, Stefan [3]
Reiher, Markus [3]
Affiliations
[1] Univ Toronto, Dept Chem, Toronto, ON M5S, Canada
[2] Univ Toronto, Dept Comp Sci, Toronto, ON M5S, Canada
[3] Swiss Fed Inst Technol, Lab Phys Chem, Vladimir Prelog Weg 2, CH-8093 Zurich, Switzerland
Funding
Swiss National Science Foundation
Keywords
DENSITY-FUNCTIONAL THEORY; BASIS-SET CONVERGENCE; PREDICTION UNCERTAINTY; RARE-GAS; ENERGIES; ACCURATE; APPROXIMATIONS; POTENTIALS; DATABASE; VALENCE;
DOI
10.1021/acs.jctc.9b00627
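Chinese Library Classification (CLC) number
O64 [Physical chemistry (theoretical chemistry), chemical physics]
Discipline classification code
070304; 081704
Abstract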
We employ Gaussian process (GP) regression to adjust for systematic errors in D3-type dispersion corrections. We refer to the associated, statistically improved model as D3-GP. It is trained on differences between interaction energies obtained from PBE-D3(BJ)/ma-def2-QZVPP and DLPNO-CCSD(T)/CBS calculations. We generated a data set containing interaction energies for 1248 molecular dimers, which resemble the dispersion-dominated systems contained in the S66 data set. Our systems represent not only equilibrium structures but also dimers with various relative orientations and conformations at both shorter and longer distances. A reparametrization of the D3(BJ) model based on 66 of these dimers suggests that two of its three empirical parameters, a(1) and s(8), are zero, whereas a(2) = 5.6841 bohr. For the remaining 1182 dimers, we find that this new set of parameters is superior to all previously published D3(BJ) parameter sets. To train our D3-GP model, we engineered two different vectorial representations of (supra-)molecular systems, both derived from the matrix of atom-pairwise D3(BJ) interaction terms: (a) a distance-resolved interaction energy histogram, histD3(BJ), and (b) eigenvalues of the interaction matrix ordered according to their decreasing absolute value, eigD3(BJ). Hence, the GP learns a mapping from D3(BJ) information only, which renders D3-GP-type dispersion corrections comparable to those obtained with the original D3 approach. They improve systematically if the underlying training set is selected carefully. Here, we harness the prediction variance obtained from GP regression to select optimal training sets in an automated fashion. The larger the variance, the more information the corresponding data point may add to the training set.
For a given set of molecular systems, variance-based sampling can approximately determine the smallest subset being subjected to reference calculations such that all dispersion corrections for the remaining systems fall below a predefined accuracy threshold. To render the entire D3-GP workflow as efficient as possible, we present an improvement over our variance-based, sequential active-learning scheme [J. Chem. Theory Comput. 2018, 14, 5238]. Our refined learning algorithm selects multiple (instead of single) systems that can be subjected to reference calculations simultaneously. We refer to the underlying selection strategy as batchwise variance-based sampling (BVS). BVS-guided active learning is an essential component of our D3-GP workflow, which is implemented in a black-box fashion. Once provided with reference data for new molecular systems, the underlying GP model automatically learns to adapt to these and similar systems. This approach leads overall to a self-improving model (D3-GP) that predicts system-focused and GP-refined D3-type dispersion corrections for any given system of reference data.
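The idea of selecting several systems at once by prediction variance can be illustrated with a minimal sketch. It is not the authors' BVS implementation; it shows one common greedy scheme that relies on the fact that a GP's predictive variance depends only on the inputs, so a batch can be chosen before any reference energy is computed. The function names, the squared-exponential kernel, the noise level, and the one-dimensional toy feature vectors (standing in for a histD3(BJ) or eigD3(BJ) representation) are all assumptions made for illustration.

```python
import math


def rbf_kernel(x, y, length_scale=1.0):
    """Squared-exponential kernel between two feature vectors (assumed choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)


def select_batch(candidates, batch_size, noise=1e-6, length_scale=1.0):
    """Greedily pick the candidates with the largest GP predictive variance.

    After each pick, the covariance of all candidates is conditioned on the
    picked point. No reference energy is needed for this update, because the
    GP variance depends only on the inputs.
    """
    n = len(candidates)
    # Prior covariance among all candidates (no training data yet).
    cov = [[rbf_kernel(candidates[i], candidates[j], length_scale)
            for j in range(n)] for i in range(n)]
    chosen = []
    for _ in range(min(batch_size, n)):
        # Unchosen candidate with the largest current predictive variance.
        best = max((i for i in range(n) if i not in chosen),
                   key=lambda i: cov[i][i])
        chosen.append(best)
        # Rank-one update: condition the covariance on observing `best`.
        col = [cov[i][best] for i in range(n)]
        denom = cov[best][best] + noise
        for i in range(n):
            for j in range(n):
                cov[i][j] -= col[i] * col[j] / denom
    return chosen


# Hypothetical toy pool: two pairs of near-duplicate systems and one outlier.
pool = [[0.0], [0.01], [5.0], [5.02], [10.0]]
batch = select_batch(pool, batch_size=3)
print(batch)
```

In this toy pool the near-duplicate systems are never selected together, because choosing one of them collapses the predictive variance of its neighbor. That is the property that makes a batch of simultaneous reference calculations worthwhile.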
Pages: 6046-6060
Page count: 15