A confidence predictor for logD using conformal regression and a support-vector machine

Authors
Maris Lapins
Staffan Arvidsson
Samuel Lampa
Arvid Berg
Wesley Schaal
Jonathan Alvarsson
Ola Spjuth
Affiliation
[1] Uppsala University, Department of Pharmaceutical Biosciences
Keywords
Conformal prediction; Machine learning; QSAR; Support-vector machine; LogD; RDF
Abstract
Lipophilicity is a major determinant of ADMET properties and overall suitability of drug candidates. We have developed large-scale models to predict the water–octanol distribution coefficient (logD) for chemical compounds, aiding drug discovery projects. Using ACD/logD data for 1.6 million compounds from the ChEMBL database, models are created and evaluated by a support-vector machine with a linear kernel using conformal prediction methodology, outputting prediction intervals at a specified confidence level. The resulting model shows a predictive ability of Q² = 0.973, and the best-performing nonconformity measure gives a median prediction interval of ±0.39 log units at 80% confidence and ±0.60 log units at 90% confidence. The model is available as an online service via an OpenAPI interface and a web page with a molecular editor, and we also publish predictive values at 90% confidence level for 91 M PubChem structures in RDF format for download and as a URI resolver service.
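To illustrate how conformal regression turns a point prediction into an interval at a chosen confidence level, here is a minimal sketch of inductive (split) conformal regression. It is not the authors' implementation. It assumes a one-descriptor ordinary least-squares line as a stand-in for the linear-kernel support-vector machine, the plain absolute residual as the nonconformity measure (the abstract does not say which measure performed best), and synthetic data. All function and variable names are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares line; an assumed stand-in for the linear-kernel SVM."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    slope, intercept = model
    return slope * x + intercept


def conformal_half_width(cal_scores, confidence):
    """Interval half-width from calibration nonconformity scores."""
    n = len(cal_scores)
    # Rank of the calibration score that yields the requested coverage.
    k = math.ceil((n + 1) * confidence)
    if k > n:
        return float("inf")  # too few calibration examples for this level
    return sorted(cal_scores)[k - 1]


def predict_interval(model, half_width, x):
    y_hat = predict(model, x)
    return y_hat - half_width, y_hat + half_width


# Synthetic data (assumption): one descriptor, linear response plus noise.
rng = random.Random(0)
xs, ys = [], []
for _ in range(500):
    x = rng.uniform(-3.0, 3.0)
    xs.append(x)
    ys.append(1.5 * x - 0.5 + rng.gauss(0.0, 0.4))

# Proper training set fits the model; calibration set is held out.
train_x, train_y = xs[:300], ys[:300]
cal_x, cal_y = xs[300:], ys[300:]

model = fit_line(train_x, train_y)

# Nonconformity score: absolute residual on the calibration set.
cal_scores = [abs(y - predict(model, x)) for x, y in zip(cal_x, cal_y)]

w80 = conformal_half_width(cal_scores, 0.80)
w90 = conformal_half_width(cal_scores, 0.90)

print("half-width at 80% confidence:", round(w80, 3))
print("half-width at 90% confidence:", round(w90, 3))
print("90% interval at x = 1.0:", predict_interval(model, w90, 1.0))
```

With the absolute-residual score every compound receives an interval of the same width. Measures that scale the residual by a per-compound difficulty estimate give intervals that vary between compounds, which is one reason the choice of nonconformity measure matters for the median interval width reported in the abstract.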