Explainable Supervised Machine Learning Model To Predict Solvation Gibbs Energy

Cited by: 7
Authors
Ferraz-Caetano, Jose [2]
Teixeira, Filipe [1]
Cordeiro, M. Natalia D. S. [2]
Affiliations
[1] Univ Minho, Ctr Chem, Campus Gualtar, P-4710057 Braga, Portugal
[2] Univ Porto, Fac Sci, Dept Chem & Biochem, Rua Campo Alegre, P-4169007 Porto, Portugal
Keywords
ORGANIC-MOLECULES; DISCOVERY; SHELL
DOI
10.1021/acs.jcim.3c00544
Chinese Library Classification number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
Many challenges persist in developing accurate computational models for predicting solvation free energy (ΔG_sol). Despite recent developments in Machine Learning (ML) methodologies that outperformed traditional quantum mechanical models, several issues remain concerning explanatory insights for broad chemical predictions with an acceptable speed-accuracy trade-off. To overcome this, we present a novel supervised ML model to predict the ΔG_sol for an array of solvent-solute pairs. Using two different ensemble regressor algorithms, we made fast and accurate property predictions using open-source chemical features, encoding complex electronic, structural, and surface area descriptors for every solvent and solute. By integrating molecular properties and chemical interaction features, we have analyzed individual descriptor importance and optimized our model through explanatory information from feature groups. On aqueous and organic solvent databases, ML models revealed the predictive relevance of solutes with increasing polar surface area and decreasing polarizability, yielding better results than state-of-the-art benchmark Neural Network methods (without complex quantum mechanical or molecular dynamics simulations). Both algorithms successfully outperformed previous ΔG_sol prediction methods, with a maximum absolute error of 0.22 ± 0.02 kcal mol⁻¹, further validated in an external benchmark database and with solvent hold-out tests. These explanatory and statistical insights allow a thoughtful application of this method to predicting other thermodynamic properties, stressing the relevance of ML modeling for further complex computational chemistry problems.
Pages: 2250-2262
Page count: 13