Benchmarking Active Learning Protocols for Ligand-Binding Affinity Prediction

Cited by: 4
Authors
Gorantla, Rohan [1, 2, 3]
Kubincova, Alzbeta [3]
Suutari, Benjamin [3]
Cossins, Benjamin P. [3]
Mey, Antonia S. J. S. [2]
Affiliations
[1] Univ Edinburgh, Sch Informat, Edinburgh EH8 9AB, Scotland
[2] Univ Edinburgh, EaStCHEM Sch Chem, Edinburgh EH9 3FJ, Scotland
[3] Exscientia, Oxford OX4 4GE, England
Funding
UK Research and Innovation (UKRI)
DOI
10.1021/acs.jcim.4c00220
Chinese Library Classification number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
Active learning (AL) has become a powerful tool in computational drug discovery, enabling the identification of top binders from vast molecular libraries. To design a robust AL protocol, it is important to understand the influence of AL parameters, as well as the features of the data sets, on the outcomes. We use four affinity data sets for different targets (TYK2, USP7, D2R, Mpro) to systematically evaluate the performance of machine learning models [Gaussian process (GP) model and Chemprop model], sample selection protocols, and the batch size based on metrics describing the overall predictive power of the model (R², Spearman rank, root-mean-square error) as well as the accurate identification of top 2%/5% binders (Recall, F1 score). Both models have a comparable Recall of top binders on large data sets, but the GP model surpasses the Chemprop model when training data are sparse. A larger initial batch size, especially on diverse data sets, increased the Recall of both models as well as overall correlation metrics. However, for subsequent cycles, smaller batch sizes of 20 or 30 compounds proved to be desirable. Furthermore, adding artificial Gaussian noise to the data up to a certain threshold still allowed the model to identify clusters with top-scoring compounds. However, excessive noise (<1σ) did impact the model's predictive and exploitative capabilities.
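As a rough illustration of the top-binder Recall metric mentioned in the abstract, the following minimal sketch computes the share of the true top 2% (or 5%) of compounds that an AL protocol has selected. The function name `top_fraction_recall`, the rounding rule for the cutoff, and the convention that a higher value means a stronger binder are assumptions made for this example; the paper's exact definition may differ.

```python
def top_fraction_recall(affinities, selected, fraction=0.02):
    """Share of the true top binders that appear among the selected compounds.

    affinities: measured affinity per compound (assumed: higher = stronger binder)
    selected:   indices of compounds acquired so far by the AL protocol
    fraction:   size of the "top binder" set, e.g. 0.02 for the top 2%
    """
    # Assumed cutoff rule: round to the nearest compound, keep at least one.
    n_top = max(1, round(fraction * len(affinities)))
    # Rank compound indices from strongest to weakest binder.
    ranked = sorted(range(len(affinities)),
                    key=lambda i: affinities[i], reverse=True)
    top = set(ranked[:n_top])
    return len(top & set(selected)) / n_top


# Toy library of 100 compounds whose affinity equals their index.
library = [float(i) for i in range(100)]
print(top_fraction_recall(library, selected=[0, 5, 99], fraction=0.02))
```

In the toy library the top 2% are compounds 98 and 99, so selecting only compound 99 recovers half of the top binders.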
Pages: 1955-1965
Number of pages: 11