Accurate prediction of chemical reactions in solution is challenging for current state-of-the-art approaches based on transition state modelling with density functional theory. Models based on machine learning have emerged as a promising alternative to address these problems, but these models currently lack the precision to give crucial information on the magnitude of barrier heights, the influence of solvents and catalysts, and the extent of regio- and chemoselectivity. Here, we construct hybrid models which combine traditional transition state modelling and machine learning to accurately predict reaction barriers. We train a Gaussian Process Regression model to reproduce high-quality experimental kinetic data for the nucleophilic aromatic substitution reaction and use it to predict barriers with a mean absolute error of 0.77 kcal mol⁻¹ for an external test set. The model was further validated on regio- and chemoselectivity prediction on patent reaction data and achieved a competitive top-1 accuracy of 86%, despite not being trained explicitly for this task. Importantly, the model gives error bars for its predictions that can be used for risk assessment by the end user. Hybrid models emerge as the preferred alternative for accurate reaction prediction in the very common low-data situation where only 100-150 rate constants are available for a reaction class. With recent advances in deep learning for quickly predicting barriers and transition state geometries from density functional theory, we envision that hybrid models will soon become a standard alternative to complement current machine learning approaches based on ground-state physical organic descriptors or structural information such as molecular graphs or fingerprints.
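To make the link between predicted barriers, error bars, and selectivity concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that each candidate reaction site comes with a predicted barrier and a Gaussian standard deviation (as a Gaussian process would supply), that the predictions are independent, and that barriers translate to rate constants through the standard Eyring equation with a transmission coefficient of 1. The function names, the site labels `C2` and `C4`, and all numerical barrier values are hypothetical.

```python
import math

# Physical constants (CODATA values, converted where needed).
R_KCAL = 1.987204e-3        # gas constant in kcal mol^-1 K^-1
KB_OVER_H = 2.083661912e10  # Boltzmann constant / Planck constant, s^-1 K^-1


def eyring_rate(barrier_kcal, temperature=298.15):
    """Rate constant (s^-1) for a free-energy barrier in kcal/mol (Eyring equation)."""
    return KB_OVER_H * temperature * math.exp(-barrier_kcal / (R_KCAL * temperature))


def top1_site(predictions):
    """Return the site whose predicted mean barrier is lowest.

    `predictions` maps a site label to a (mean, std) tuple in kcal/mol.
    """
    return min(predictions, key=lambda site: predictions[site][0])


def prob_first_is_lower(mean_a, std_a, mean_b, std_b):
    """P(barrier_a < barrier_b) for two independent Gaussian predictions."""
    diff_std = math.hypot(std_a, std_b)  # std of the difference b - a
    z = (mean_b - mean_a) / diff_std
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical predictions for two competing sites: (mean, std) in kcal/mol.
predictions = {"C2": (20.1, 0.8), "C4": (22.0, 0.9)}

best = top1_site(predictions)
confidence = prob_first_is_lower(*predictions["C2"], *predictions["C4"])
rate_ratio = eyring_rate(predictions["C2"][0]) / eyring_rate(predictions["C4"][0])

print(f"predicted major site: {best}")
print(f"probability that this ranking is correct: {confidence:.2f}")
print(f"predicted rate ratio C2/C4 at 298.15 K: {rate_ratio:.1f}")
```

Under these assumptions, the standard deviations turn a bare site ranking into a probability that the ranking holds, which is one way an end user could apply the error bars for risk assessment. The same exponential relationship shows why sub-kcal accuracy matters: at 298.15 K, a barrier error of 0.77 kcal mol⁻¹ corresponds to a rate constant that is off by a factor of roughly 3.7.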