Bayesian similarity searching in high-dimensional descriptor spaces combined with Kullback-Leibler descriptor divergence analysis

Cited by: 11
Authors
Vogt, Martin [1]
Bajorath, Jürgen [1]
Affiliations
[1] Rhein Friedrich Wilhelms Univ Bonn, Dept Life Sci Informat, B-IT, D-53113 Bonn, Germany
DOI
10.1021/ci700333t
Chinese Library Classification (CLC) number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
We investigate an approach that combines Bayesian modeling of probability distributions of descriptor values of active and database molecules with Kullback-Leibler analysis of the divergence between these distributions. The methodology is used for Bayesian screening and also to predict compound recall rates. In our study, we analyze two fundamental approximations underlying the Bayesian screening approach: the assumption that descriptors are independent of each other and, furthermore, that their data set values follow normal distributions. In addition, we calculate Kullback-Leibler divergence for single descriptors, rather than multiple-feature distributions, in order to prioritize descriptors for screening calculations. The results show that descriptor correlation effects, violating the assumption of feature independence, can lead to a notable reduction of compound recall in Bayesian screening. Controlling descriptor correlation effects plays a much more significant role in achieving high recall rates than approximating descriptor distributions by Gaussians. Furthermore, Kullback-Leibler divergence analysis is shown to systematically identify descriptors that are the most relevant for the outcome of Bayesian screening calculations.
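The two ingredients named in the abstract, per-descriptor Gaussian models under an independence assumption and single-descriptor Kullback-Leibler divergence for prioritizing descriptors, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the closed-form divergence between two univariate Gaussians is standard, but the function names, the log-odds scoring form, and the toy descriptor values are all assumptions made here for illustration.

```python
import math
from statistics import mean, pstdev


def gaussian_kl(mu_a, sd_a, mu_b, sd_b):
    """KL divergence D(A || B) between two univariate Gaussians, in nats."""
    return (math.log(sd_b / sd_a)
            + (sd_a ** 2 + (mu_a - mu_b) ** 2) / (2 * sd_b ** 2)
            - 0.5)


def log_gaussian(x, mu, sd):
    """Log density of a univariate Gaussian at x."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - (x - mu) ** 2 / (2 * sd ** 2)


def fit(columns):
    """Estimate (mean, std) for each descriptor column."""
    return [(mean(c), pstdev(c)) for c in columns]


def rank_descriptors(active_params, db_params):
    """Return descriptor indices sorted by decreasing KL divergence, plus the values."""
    kl = [gaussian_kl(ma, sa, mb, sb)
          for (ma, sa), (mb, sb) in zip(active_params, db_params)]
    order = sorted(range(len(kl)), key=lambda i: kl[i], reverse=True)
    return order, kl


def log_odds(x, active_params, db_params, selected):
    """Sum of per-descriptor log-likelihood ratios (independence assumption)."""
    return sum(log_gaussian(x[i], *active_params[i])
               - log_gaussian(x[i], *db_params[i])
               for i in selected)


# Hypothetical toy data: one list of values per descriptor (not from the paper).
actives = [[4.8, 5.1, 5.0, 5.3, 4.9],        # descriptor 0: shifted in actives
           [1.0, 2.5, 0.2, 1.8, 3.1]]        # descriptor 1: similar to database
database = [[1.0, 2.0, 3.0, 2.5, 1.5, 4.0],
            [1.2, 2.4, 0.5, 1.9, 3.0, 0.8]]

active_params = fit(actives)
db_params = fit(database)
order, kl = rank_descriptors(active_params, db_params)

top = order[:1]  # keep only the most divergent descriptor
score_active_like = log_odds([5.0, 1.5], active_params, db_params, top)
score_db_like = log_odds([2.0, 1.5], active_params, db_params, top)
print(order, kl, score_active_like, score_db_like)
```

In this sketch a descriptor whose distribution in the actives differs strongly from its distribution in the database receives a high divergence and is used first for scoring. Summing per-descriptor log ratios is only valid if descriptors are independent, which is the approximation the abstract identifies as the main source of reduced recall.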
Pages: 247-255
Number of pages: 9