Building Quantitative Structure-Activity Relationship Models Using Bayesian Additive Regression Trees

Cited by: 6
Authors
Feng, Dai [1 ]
Svetnik, Vladimir [1 ]
Liaw, Andy [1 ]
Pratola, Matthew [2 ]
Sheridan, Robert P. [3 ]
Affiliations
[1] Merck & Co Inc, Biomet Res, Kenilworth, NJ 07033 USA
[2] Ohio State Univ, Dept Stat, Cockins Hall,1958 Neil Ave, Columbus, OH 43210 USA
[3] Merck & Co Inc, Modeling & Informat, Kenilworth, NJ 07033 USA
Funding
U.S. National Science Foundation
Keywords
COMPOUND CLASSIFICATION; CART; TOOL;
DOI
10.1021/acs.jcim.9b00094
Chinese Library Classification number
R914 [Medicinal chemistry]
Discipline classification code
100701
Abstract
Quantitative structure-activity relationship (QSAR) modeling is a widely used technique for predicting the biological activity of a molecule from the information contained in its molecular descriptors. The large number of compounds and descriptors, together with the sparseness of the descriptors, poses important challenges to the traditional statistical methods and machine learning (ML) algorithms, such as random forest (RF), used in this field. Recently, Bayesian Additive Regression Trees (BART), a flexible Bayesian nonparametric regression approach, has been demonstrated to be competitive with widely used ML approaches. Rather than focusing only on accurate point estimation, BART is formulated entirely in a hierarchical Bayesian modeling framework, which allows one to quantify uncertainty and hence to provide both point and interval estimates for a variety of quantities of interest. We studied BART as a model builder for QSAR and demonstrated that the approach tends to have predictive performance comparable to RF. More importantly, we investigated BART's natural capability to analyze truncated (or qualified) data, to generate interval estimates for molecular activities as well as descriptor importance, and to conduct model diagnosis, tasks that could not be easily handled through other approaches.
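To illustrate what "both point and interval estimation" means in practice, the following minimal sketch summarizes posterior predictive draws into a posterior mean and an equal-tailed credible interval per compound. It is not the authors' code and does not call any BART library: the helper name `summarize_posterior` is hypothetical, and the draws are simulated with NumPy as a stand-in for the output of a fitted Bayesian sampler.

```python
import numpy as np


def summarize_posterior(draws, level=0.95):
    """Summarize posterior draws of shape (n_draws, n_compounds).

    Returns the posterior mean and the lower and upper bounds of an
    equal-tailed credible interval for each compound.
    """
    draws = np.asarray(draws, dtype=float)
    alpha = 1.0 - level
    mean = draws.mean(axis=0)
    lower = np.quantile(draws, alpha / 2.0, axis=0)
    upper = np.quantile(draws, 1.0 - alpha / 2.0, axis=0)
    return mean, lower, upper


# Assumption: simulated draws stand in for real sampler output.
# Three hypothetical compounds with increasing predictive uncertainty.
rng = np.random.default_rng(0)
true_activity = np.array([5.0, 6.5, 7.2])
draws = true_activity + rng.normal(scale=[0.2, 0.5, 1.0], size=(1000, 3))

mean, lower, upper = summarize_posterior(draws)
for i in range(draws.shape[1]):
    print(f"compound {i}: mean={mean[i]:.2f}, "
          f"95% interval=[{lower[i]:.2f}, {upper[i]:.2f}]")
```

A wider interval flags a compound whose predicted activity is less certain, which is information a point estimate alone does not convey.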
Pages: 2642 - 2655
Number of pages: 14
Related papers
50 in total
  • [21] Quantitative Structure-Activity Relationship Models That Stand the Test of Time
    Davis, Andrew M.
    Wood, David J.
    MOLECULAR PHARMACEUTICS, 2013, 10 (04) : 1183 - 1190
  • [22] Understanding the antifungal activity of terbinafine analogues using quantitative structure-activity relationship (QSAR) models
    Gokhale, VM
    Kulkarni, VM
    BIOORGANIC & MEDICINAL CHEMISTRY, 2000, 8 (10) : 2487 - 2499
  • [23] Toward quantitative structure-activity relationship (QSAR) models for nanoparticles
    Odziomek, Katarzyna
    Ushizima, Daniela
    Puzyn, Tomasz
    Haranczyk, Maciej
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2014, 248
  • [24] Quantitative structure-activity relationship of the curcumin-related compounds using various regression methods
    Khazaei, Ardeshir
    Sarmasti, Negin
    Seyf, Jaber Yousefi
    JOURNAL OF MOLECULAR STRUCTURE, 2016, 1108 : 168 - 178
  • [25] Extraction of structure-activity relationships using structural regression trees.
    Helma, C
    Gottmann, E
    Pfahringer, B
    Kramer, S
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 1999, 217 : U549 - U549
  • [26] Deriving quantitative structure-activity relationship models using genetic programming for drug discovery
    Neophytou, Katerina
    Nicolaou, Christos A.
    Pattichis, Constantinos S.
    Schizas, Christos N.
    2007 6TH INTERNATIONAL SPECIAL TOPIC CONFERENCE ON INFORMATION TECHNOLOGY APPLICATIONS IN BIOMEDICINE, 2007 : 180 - 183
  • [27] Molecular Modeling: Origin, Fundamental Concepts and Applications Using Structure-Activity Relationship and Quantitative Structure-Activity Relationship
    Rodrigues dos Santos, Cleydson Breno
    Lobato, Cleison Carvalho
    Costa de Sousa, Marcos Alexandre
    da Cruz Macedo, Williams Jorge
    Tavares Carvalho, Jose Carlos
    REVIEWS IN THEORETICAL SCIENCE, 2014, 2 (02) : 91 - 115
  • [28] Parallel Bayesian Additive Regression Trees
    Pratola, Matthew T.
    Chipman, Hugh A.
    Gattiker, James R.
    Higdon, David M.
    McCulloch, Robert
    Rust, William N.
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2014, 23 (03) : 830 - 852
  • [29] BART: BAYESIAN ADDITIVE REGRESSION TREES
    Chipman, Hugh A.
    George, Edward I.
    McCulloch, Robert E.
    ANNALS OF APPLIED STATISTICS, 2010, 4 (01) : 266 - 298
  • [30] Building Highly Reliable Quantitative Structure-Activity Relationship Classification Models Using the Rivality Index Neighborhood Algorithm with Feature Selection
    Luque Ruiz, Irene
    Angel Gomez-Nieto, Miguel
    JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2020, 60 (01) : 133 - 151