Machine-Learning Model for Predicting the Rate Constant of Protein-Ligand Dissociation

Cited by: 2
Authors
Su, Minyi [1, 2]
Liu, Huisi [3]
Lin, Haixia [3]
Wang, Renxiao [1, 2]
Affiliations
[1] Chinese Acad Sci, Shanghai Inst Organ Chem, State Key Lab Bioorgan & Nat Prod Chem, Shanghai 200032, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Shanghai Univ, Coll Sci, Dept Chem, Shanghai 200444, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Dissociation rate constant; Ligand binding kinetics; Random forest model; Protein-ligand interaction; Structure-based drug design; TARGET RESIDENCE TIME; MOLECULAR-DYNAMICS; ACCURATE DOCKING; CD-HIT; BINDING; GLIDE; COMPLEXES; KINETICS; SETS; LEAD;
DOI
10.3866/PKU.WHXB201907006
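Chinese Library Classification number
O64 [Physical chemistry (theoretical chemistry), chemical physics]
Discipline classification code
070304; 081704
Abstract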
A growing number of studies have shown that the binding kinetics of a drug molecule to its target correlates strongly with its efficacy in vivo. Ligand optimization aimed at improved binding kinetics therefore offers a new direction for rational drug design. Currently, ligand binding kinetics is modeled mainly through extensive molecular dynamics simulations, whose computational cost limits their application to real-world problems. The present study aimed to obtain a general-purpose quantitative structure-kinetics relationship (QSKR) model for predicting the dissociation rate constant (k_off) of a ligand from the structure of its protein-ligand complex. A model of this type is expected to be suitable for high-throughput tasks in structure-based drug design. We collected experimentally measured k_off values for 406 ligand molecules from the literature and constructed a three-dimensional structural model of each protein-ligand complex through molecular modeling. A training set was compiled from 60% of those complexes, and the remaining 40% were assigned to two test sets. A random forest algorithm was used to derive the QSKR model from distance-dependent protein-ligand atom pair descriptors. Multiple random forest models were generated from descriptor sets obtained under different settings of the distance cutoff, bin width, and feature selection criterion, and their cross-validation results were compared. The optimal model was obtained with a distance cutoff of 15 Å (1 Å = 0.1 nm), a bin width of 3 Å, and a feature selection variance level of 2. The final QSKR model produced correlation coefficients of around 0.62 on the two independent test sets. This level of accuracy is at least comparable to that of predictive models described in the literature, which are typically much more expensive computationally. Our study is an attempt to address the problem of predicting k_off values in drug design. We hope that it can provide inspiration for further studies by other researchers.
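The distance-dependent atom pair descriptors mentioned in the abstract can be illustrated with a minimal sketch. It assumes that atoms are typed by chemical element only and that each feature is a count of protein-ligand atom pairs falling in a given distance bin; the abstract does not specify the atom typing scheme, so the function name `atom_pair_descriptor`, the input layout, and the toy coordinates are all hypothetical. Only the 15 Å cutoff and the 3 Å bin width come from the abstract.

```python
import math
from collections import Counter


def atom_pair_descriptor(protein_atoms, ligand_atoms, cutoff=15.0, bin_width=3.0):
    """Count protein-ligand atom pairs per (protein element, ligand element, bin).

    Each atom is an (element, (x, y, z)) tuple with coordinates in angstroms.
    Element-based typing is an assumption made for this illustration.
    """
    n_bins = math.ceil(cutoff / bin_width)
    counts = Counter()
    for p_elem, p_xyz in protein_atoms:
        for l_elem, l_xyz in ligand_atoms:
            d = math.dist(p_xyz, l_xyz)
            if d >= cutoff:
                continue  # pairs beyond the cutoff contribute nothing
            bin_index = min(int(d // bin_width), n_bins - 1)
            counts[(p_elem, l_elem, bin_index)] += 1
    return counts


# Toy complex with made-up coordinates (not taken from the study).
protein = [("C", (0.0, 0.0, 0.0)), ("N", (5.0, 0.0, 0.0)), ("O", (20.0, 0.0, 0.0))]
ligand = [("C", (1.0, 0.0, 0.0)), ("O", (0.0, 7.0, 0.0))]

features = atom_pair_descriptor(protein, ligand)
for key in sorted(features):
    print(key, features[key])
```

With a 15 Å cutoff and 3 Å bins, each element pair yields five distance bins. In a workflow like the one described, the counts for every complex would be laid out as a fixed-length vector and passed to a random forest regressor trained against the measured k_off values.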
Pages: 9