Evaluation of Machine Learning Models for Aqueous Solubility Prediction in Drug Discovery

Cited by: 0
Authors
Xue, Nian [1]
Zhang, Yuzhu [2]
Liu, Sensen [3]
Affiliations
[1] NYU, Dept Comp Sc & Engn, New York, NY USA
[2] Carnegie Mellon Univ, Sch Comp Sc, Pittsburgh, PA 15213 USA
[3] Washington Univ, Dept Elect & Syst Engn, St Louis, MO 63110 USA
Keywords
Machine Learning; Solubility Prediction; Drug Discovery; Feature Importance; Descriptors; QSAR
DOI
10.1109/ICAIBD62003.2024.10604556
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Determining the aqueous solubility of chemical compounds is of great importance in in-silico drug discovery. However, predicting aqueous solubility both accurately and rapidly remains a challenging task. This paper explores and evaluates the predictive performance of multiple machine learning models for the aqueous solubility of compounds. Specifically, we apply a series of machine learning algorithms, including Random Forest, XGBoost, LightGBM, and CatBoost, to a well-established aqueous solubility dataset (i.e., the Huuskonen dataset) of over 1200 compounds. Experimental results show that even traditional machine learning algorithms can achieve satisfactory performance with high accuracy. In addition, our investigation goes beyond prediction accuracy and examines the interpretability of the models to identify key features and understand the molecular properties that influence the predicted outcomes. This study sheds light on the use of machine learning approaches to predict compound solubility, which could significantly shorten the time that researchers spend on new drug discovery.
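The abstract describes two evaluation steps: scoring prediction accuracy and identifying which features drive the predictions. It does not say how feature importance was computed, so the sketch below uses permutation importance as one common, model-agnostic option. Everything in it is an assumption for illustration: the data are synthetic, the descriptor names (`logP`, `mol_weight_scaled`, `noise`) are hypothetical, and `toy_model` is a hand-written linear function standing in for a trained Random Forest or gradient-boosting model. None of it reproduces the Huuskonen dataset or the paper's results.

```python
import random
from statistics import mean


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred)) ** 0.5


def r2(y_true, y_pred):
    """Coefficient of determination (1 - SS_res / SS_tot)."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in RMSE when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = rmse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild each row with only feature j replaced by a shuffled value.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(rmse(y, [predict(r) for r in X_perm]) - baseline)
        importances.append(mean(increases))
    return importances


# Synthetic stand-in data (assumption): three hypothetical descriptors,
# of which only the first two influence the synthetic log-solubility target.
rng = random.Random(42)
FEATURES = ["logP", "mol_weight_scaled", "noise"]
X = [[rng.uniform(-2, 6), rng.uniform(0, 5), rng.uniform(0, 1)] for _ in range(200)]
y = [0.5 - 1.0 * row[0] - 0.3 * row[1] + rng.gauss(0, 0.1) for row in X]


def toy_model(row):
    """Hypothetical stand-in for a fitted regressor; ignores the noise feature."""
    return 0.5 - 1.0 * row[0] - 0.3 * row[1]


preds = [toy_model(row) for row in X]
scores = permutation_importance(toy_model, X, y)
ranking = sorted(zip(FEATURES, scores), key=lambda kv: kv[1], reverse=True)

print(f"RMSE = {rmse(y, preds):.3f}, R^2 = {r2(y, preds):.3f}")
for name, score in ranking:
    print(f"{name}: RMSE increase {score:.3f}")
```

Any callable that maps a feature row to a prediction can replace `toy_model`, so the same loop could be pointed at a fitted tree ensemble. Shuffling a feature the model ignores leaves the error unchanged, which is why the `noise` column scores zero here.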
Pages: 26-33
Page count: 8
Related papers
50 in total
  • [41] An evaluation of thermodynamic models for the prediction of drug and drug-like molecule solubility in organic solvents
    Bouillot, Baptiste
    Teychene, Sebastien
    Biscans, Beatrice
    FLUID PHASE EQUILIBRIA, 2011, 309 (01) : 36 - 52
  • [42] Machine learning model for prediction of drug solubility in supercritical solvent: Modeling and experimental validation
    An, Feifei
    Sayed, Biju Theruvil
    Parra, Rosario Mireya Romero
    Hamad, Mohammed Haider
    Sivaraman, R.
    Foumani, Zahra Zanjani
    Rushchitc, Anastasia Andreevna
    El-Maghawry, Enas
    Alzhrani, Rami M.
    Alshehri, Sameer
    AboRas, Kareem M.
    JOURNAL OF MOLECULAR LIQUIDS, 2022, 363
  • [43] Machine learning in the evaluation and prediction models of biochar application: A review
    Chen, Meng-Wei
    Chang, Meng-Shiuh
    Mao, Yuehua
    Hu, Shuyin
    Kung, Chih-Chun
    SCIENCE PROGRESS, 2023, 106 (01)
  • [44] Development of quantitative structure-property relationship models for early ADME evaluation in drug discovery. 1. Aqueous solubility
    Liu, RF
    So, SS
    JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2001, 41 (06) : 1633 - 1639
  • [45] Machine learning models in the prediction of drug metabolism: challenges and future perspectives
    Litsa, Eleni E.
    Das, Payel
    Kavraki, Lydia E.
    EXPERT OPINION ON DRUG METABOLISM & TOXICOLOGY, 2021, 17 (11) : 1245 - 1247
  • [46] Prediction of CO2 solubility in aqueous amine solutions using machine learning method
    Liu, Bin
    Yu, Yanan
    Liu, Zijian
    Cui, Zhe
    Tian, Wende
    SEPARATION AND PURIFICATION TECHNOLOGY, 2025, 354
  • [47] Support Vector Machine Prediction of Drug Solubility on GPUs
    Cano, Gaspar
    Garcia-Rodriguez, Jose
    Orts-Escolano, Sergio
    Pena-Garcia, Jorge
    Kumar-Yadav, Dharmendra
    Perez-Garrido, Alfonso
    Perez-Sanchez, Horacio
    BIOINFORMATICS AND BIOMEDICAL ENGINEERING (IWBBIO 2015), PT II, 2015, 9044 : 645 - 654
  • [48] Data-driven machine learning models for the prediction of hydrogen solubility in aqueous systems of varying salinity: Implications for underground hydrogen storage
    Thanh, Hung Vo
    Zhang, Hemeng
    Dai, Zhenxue
    Zhang, Tao
    Tangparitkul, Suparit
    Min, Baehyun
    INTERNATIONAL JOURNAL OF HYDROGEN ENERGY, 2024, 55 : 1422 - 1433
  • [49] Prediction of organic compound aqueous solubility using machine learning: a comparison study of descriptor-based and fingerprints-based models
    Tayyebi, Arash
    Alshami, Ali S.
    Rabiei, Zeinab
    Yu, Xue
    Ismail, Nadhem
    Talukder, Musabbir Jahan
    Power, Jason
    JOURNAL OF CHEMINFORMATICS, 2023, 15 (01)