SolTranNet-A Machine Learning Tool for Fast Aqueous Solubility Prediction

Cited by: 47
Authors
Francoeur, Paul G. [1]
Koes, David R. [1]
Affiliation
[1] University of Pittsburgh, Department of Computational and Systems Biology, Pittsburgh, PA 15260, USA
Keywords
FREE-ENERGIES
DOI
10.1021/acs.jcim.1c00331
Chinese Library Classification number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
While accurate prediction of aqueous solubility remains a challenge in drug discovery, machine learning (ML) approaches have become increasingly popular for this task. For instance, in the Second Challenge to Predict Aqueous Solubility (SC2), all groups utilized machine learning methods in their submissions. We present SolTranNet, a molecule attention transformer to predict aqueous solubility from a molecule's SMILES representation. Atypically, we demonstrate that larger models perform worse at this task, with SolTranNet's final architecture having 3,393 parameters while outperforming linear ML approaches. SolTranNet has a 3-fold scaffold split cross-validation root-mean-square error (RMSE) of 1.459 on AqSolDB and an RMSE of 1.711 on a withheld test set. We also demonstrate that, when used as a classifier to filter out insoluble compounds, SolTranNet achieves a sensitivity of 94.8% on the SC2 data set and is competitive with the other methods submitted to the competition. SolTranNet is distributed via pip, and its source code is available at https://github.com/gnina/SolTranNet.
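The two metrics quoted in the abstract, RMSE and sensitivity, can be illustrated with a minimal Python sketch. This is not SolTranNet's code or the authors' evaluation script. The function names, the toy log S values, the -4.0 cutoff, and the choice to treat "soluble" as the positive class are all assumptions made for illustration.

```python
import math


def rmse(predicted, measured):
    """Root-mean-square error between predicted and measured values."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


def sensitivity(predicted, measured, cutoff=-4.0):
    """Fraction of truly soluble compounds that are also predicted soluble.

    Assumption: a compound counts as soluble when its value is >= cutoff.
    The cutoff of -4.0 is a hypothetical choice, not taken from the paper.
    """
    actual_positive = sum(1 for m in measured if m >= cutoff)
    if actual_positive == 0:
        raise ValueError("no soluble compounds in the measured data")
    true_positive = sum(
        1 for p, m in zip(predicted, measured) if m >= cutoff and p >= cutoff
    )
    return true_positive / actual_positive


if __name__ == "__main__":
    # Toy values, invented for this example only.
    predicted_logs = [-3.0, -5.0, -2.0, -6.0]
    measured_logs = [-3.5, -3.0, -1.0, -7.0]
    print("RMSE:", rmse(predicted_logs, measured_logs))
    print("Sensitivity:", sensitivity(predicted_logs, measured_logs))
```

A high sensitivity under this definition means few soluble compounds are discarded by the filter. It says nothing about how many insoluble compounds slip through, which is why it is usually reported alongside other metrics.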
Pages: 2530-2536
Page count: 7