Machine learning meets mechanistic modelling for accurate prediction of experimental activation energies

Cited by: 132
Authors
Jorner, Kjell [1]
Brinck, Tore [2]
Norrby, Per-Ola [3]
Buttar, David [1]
Affiliations
[1] AstraZeneca, Early Chem Dev, Pharmaceut Sci, R&D, Macclesfield, Cheshire, England
[2] KTH Royal Inst Technol, Dept Chem, Appl Phys Chem, CBH, Stockholm, Sweden
[3] AstraZeneca, Data Sci & Modelling, Pharmaceut Sci, R&D, Gothenburg, Sweden
Funding
Swedish Research Council
Keywords
NUCLEOPHILIC-SUBSTITUTION; ELECTROSTATIC POTENTIALS; REACTIVITY; REGIOSELECTIVITY; CLASSIFICATION; EFFICIENT;
DOI
10.1039/d0sc04896h
Chinese Library Classification
O6 [Chemistry]
Subject classification code
0703
Abstract
Accurate prediction of chemical reactions in solution is challenging for current state-of-the-art approaches based on transition state modelling with density functional theory. Models based on machine learning have emerged as a promising alternative to address these problems, but these models currently lack the precision to give crucial information on the magnitude of barrier heights, influence of solvents and catalysts and extent of regio- and chemoselectivity. Here, we construct hybrid models which combine the traditional transition state modelling and machine learning to accurately predict reaction barriers. We train a Gaussian Process Regression model to reproduce high-quality experimental kinetic data for the nucleophilic aromatic substitution reaction and use it to predict barriers with a mean absolute error of 0.77 kcal mol⁻¹ for an external test set. The model was further validated on regio- and chemoselectivity prediction on patent reaction data and achieved a competitive top-1 accuracy of 86%, despite not being trained explicitly for this task. Importantly, the model gives error bars for its predictions that can be used for risk assessment by the end user. Hybrid models emerge as the preferred alternative for accurate reaction prediction in the very common low-data situation where only 100-150 rate constants are available for a reaction class. With recent advances in deep learning for quickly predicting barriers and transition state geometries from density functional theory, we envision that hybrid models will soon become a standard alternative to complement current machine learning approaches based on ground-state physical organic descriptors or structural information such as molecular graphs or fingerprints.
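To give a feel for what a barrier error of 0.77 kcal mol⁻¹ means for a rate constant, the following minimal sketch applies the standard Eyring (transition state theory) relation, under which an error in the activation free energy translates into a multiplicative error of exp(error / RT) in the rate constant. This is not taken from the paper: the function name `rate_error_factor` and the default temperature of 298.15 K are assumptions made here for illustration only.

```python
import math

# Gas constant in kcal mol^-1 K^-1.
R_KCAL = 1.987204e-3


def rate_error_factor(barrier_error_kcal: float, temperature_k: float = 298.15) -> float:
    """Multiplicative error in a rate constant implied by a barrier error.

    Uses the Eyring relation k ~ exp(-dG/RT), so an error of `barrier_error_kcal`
    in the barrier scales k by exp(barrier_error_kcal / (R * T)).
    The name and the default temperature are illustrative assumptions.
    """
    if temperature_k <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return math.exp(barrier_error_kcal / (R_KCAL * temperature_k))


if __name__ == "__main__":
    # 0.77 kcal/mol is the mean absolute error quoted in the abstract;
    # 298.15 K is an assumed temperature, not one stated in the abstract.
    factor = rate_error_factor(0.77)
    print(f"Rate constant error factor at 298.15 K: {factor:.2f}")
```

Under these assumptions the quoted mean absolute error corresponds to a rate constant that is off by a factor of somewhat under four at room temperature; the factor shrinks at higher temperatures because RT grows.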
Pages: 1163-1175
Number of pages: 13
Related papers
50 in total
  • [31] Combining machine learning and numerical modelling for rockburst prediction
    Papadopoulos, Dimitrios
    Benardos, Andreas
    GEOMECHANICS AND GEOENGINEERING-AN INTERNATIONAL JOURNAL, 2024, 19 (02): 183-198
  • [32] An Interpretable Machine Learning Model for Accurate Prediction of Sepsis in the ICU
    Nemati, Shamim
    Holder, Andre
    Razmi, Fereshteh
    Stanley, Matthew D.
    Clifford, Gari D.
    Buchman, Timothy G.
    CRITICAL CARE MEDICINE, 2018, 46 (04): 547-553
  • [33] Accurate Prediction of Financial Distress of Companies with Machine Learning Algorithms
    Vieira, Armando S.
    Duarte, Joao
    Ribeiro, Bernardete
    Neves, Joao C.
    ADAPTIVE AND NATURAL COMPUTING ALGORITHMS, 2009, 5495: 569-+
  • [34] Machine Learning and Scaling Laws for Prediction of Accurate Adsorption Energy
    Nayak, Sanjay
    Bhattacharjee, Satadeep
    Choi, Jung-Hae
    Lee, Seung Cheol
    JOURNAL OF PHYSICAL CHEMISTRY A, 2020, 124 (01): 247-254
  • [35] Accurate band gap prediction based on an interpretable Δ-machine learning
    Zhang, Lingyao
    Su, Tianhao
    Li, Musen
    Jia, Fanhao
    Hu, Shuobo
    Zhang, Peihong
    Ren, Wei
    MATERIALS TODAY COMMUNICATIONS, 2022, 33
  • [36] Accurate Performance and Power Prediction for FPGAs Using Machine Learning
    Sawalha, Lina
    Abuaita, Tawfiq
    Cowley, Martin
    Akhmatdinov, Sergei
    Dubs, Adam
    2022 IEEE 30TH INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM 2022), 2022: 228-228
  • [37] Accurate prediction of essential proteins using ensemble machine learning
    鲁德志
    吴淏
    侯俞彤
    吴云成
    刘媛媛
    王金武
    Chinese Physics B, 2025, 34 (01): 112-119
  • [38] Advanced Machine Learning Approaches for Accurate Migraine Prediction and Classification
    Baccouch, Chokri
    Bahar, Chaima
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2025, 16 (01): 1-11
  • [39] Accurate prediction of Snare Protein Sequence using Machine Learning
    Talpur, Dani Bux
    Shaikh, Salahuddin
    Khowaja, Ashfaque
    Adnan, Saifullah
    Ghulam, Ali
    BIOSCIENCE RESEARCH, 2022, 19 (03): 1414-1422
  • [40] A Machine Learning Model for Accurate Prediction of Sepsis in ICU Patients
    Wang, Dong
    Li, Jinbo
    Sun, Yali
    Ding, Xianfei
    Zhang, Xiaojuan
    Liu, Shaohua
    Han, Bing
    Wang, Haixu
    Duan, Xiaoguang
    Sun, Tongwen
    FRONTIERS IN PUBLIC HEALTH, 2021, 9