Chemical similarity;
Significance;
Sequence alignment;
Pharmacophore;
Virtual screening;
LOCAL SEQUENCE ALIGNMENT;
DEVELOPMENT KIT CDK;
SOURCE JAVA LIBRARY;
SMALL MOLECULES;
UNIVERSE DATABASE;
STATISTICAL DISTRIBUTION;
BIOLOGICAL INTEREST;
DRUG DESIGN;
SCORES;
DISTRIBUTIONS;
DOI:
10.1002/minf.201300021
Chinese Library Classification number:
R914 [Medicinal Chemistry];
Discipline classification code:
100701;
Abstract:
Previously, we proposed a ligand-based virtual screening technique (PhAST) based on the global alignment of linearized interaction patterns. Here, we applied techniques developed for assessing similarity in local sequence alignments to our method, yielding p-values for chemical similarity. We compared two sampling strategies, a simple sampling strategy and a Markov Chain Monte Carlo (MCMC) method, and investigated how closely the sampled score distributions match Gaussian, Gumbel, modified Gumbel, and Gamma distributions. The Gumbel distribution with a Gaussian correction term was identified as the closest fit to the observed empirical distributions. These techniques were applied in retrospective screenings on a drug-like dataset. The resulting p-values were adjusted for the size of the screening library using four different methods. Evaluation of E-value thresholds corroborated the Bonferroni correction as a preferred means of identifying significant chemical similarity with PhAST. An online version of PhAST with significance estimation is available at http://modlab-cadd.ethz.ch/.
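The general idea of turning a similarity score into a library-size-adjusted p-value can be illustrated with a minimal Python sketch. It is not the PhAST implementation: it assumes a plain Gumbel background (without the Gaussian correction term described in the abstract), fits it by the method of moments rather than by whatever fitting procedure the authors used, and draws synthetic background scores instead of sampling real alignments. All function names and parameter values are hypothetical.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(scores):
    """Method-of-moments estimate of Gumbel location (mu) and scale (beta)."""
    mean = statistics.fmean(scores)
    std = statistics.stdev(scores)
    beta = std * math.sqrt(6.0) / math.pi
    mu = mean - EULER_GAMMA * beta
    return mu, beta


def gumbel_p_value(score, mu, beta):
    """P(S >= score) for a background score S ~ Gumbel(mu, beta)."""
    z = (score - mu) / beta
    if z < -700.0:  # exp(-z) would overflow; the p-value is 1 here
        return 1.0
    # 1 - exp(-exp(-z)), written with expm1 for accuracy at small p
    return -math.expm1(-math.exp(-z))


def e_value(p, library_size):
    """Expected number of chance hits at least this good in the library."""
    return p * library_size


def bonferroni(p, library_size):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * library_size)


def sample_gumbel(rng, mu, beta):
    """Inverse-transform draw; stands in for sampled background scores."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return mu - beta * math.log(-math.log(u))


# Synthetic background (assumed parameters, for illustration only)
rng = random.Random(0)
background = [sample_gumbel(rng, 10.0, 2.0) for _ in range(5000)]
mu_hat, beta_hat = fit_gumbel_moments(background)

# Score of one hypothetical query-candidate pair, library of 10,000 compounds
p = gumbel_p_value(30.0, mu_hat, beta_hat)
print(f"p = {p:.3g}, E = {e_value(p, 10_000):.3g}, "
      f"Bonferroni p = {bonferroni(p, 10_000):.3g}")
```

The E-value and the Bonferroni-adjusted p-value differ only in the cap at 1, so for small p-values an E-value threshold and a Bonferroni threshold select the same hits.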