Molecular diversity and representativity in chemical databases

Cited by: 84
Authors
Bayada, DM [1]
Hamersma, H [1]
van Geerestein, VJ [1]
Affiliation
[1] NV Organon, Dept Mol Design & Informat, NL-5340 BH Oss, Netherlands
DOI
10.1021/ci980109e
Chinese Library Classification (CLC) number
O6 [Chemistry]
Subject classification code
0703
Abstract
It is now common practice in the pharmaceutical industry to use molecular diversity selection methods. With the advent of high throughput screening and combinatorial chemistry, compounds must be rationally selected from databases of hundreds of thousands of compounds to be tested for several biological activities. We explore the differences between diversity and representativity. Validation runs were made for different diversity selection methods (such as the MaxMin function), several representativity techniques (selection of compounds closest to centroids of clusters, Kohonen neural networks, nonlinear scaling of descriptor values), and various types of descriptors (topological and 3D fingerprints) including some validated whole-molecule numerical descriptors that were chosen for their correlation with biological activities. We find that only clustering based on fingerprints or on whole-molecule descriptors gives results consistently superior to random selection in extracting a diverse set of activities from a file with potential drug molecules. The results further indicate that clustering selection from fingerprints is biased toward small molecules, a behavior that might partly explain its success over other types of methods. Using numerical descriptors instead of fingerprints removes this bias without penalising performance too much.
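The MaxMin function named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Tanimoto distance, the set-of-bit-indices fingerprint representation, the fixed seed compound, and the names `tanimoto_distance` and `maxmin_select` are all assumptions made here for illustration. Each step greedily picks the compound whose distance to its nearest already-selected compound is largest.

```python
from typing import FrozenSet, List, Sequence

# Assumption: a fingerprint is represented as the set of indices of its "on" bits.
Fingerprint = FrozenSet[int]


def tanimoto_distance(a: Fingerprint, b: Fingerprint) -> float:
    """Return 1 - Tanimoto similarity (an assumed choice of metric)."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def maxmin_select(fps: Sequence[Fingerprint], k: int, seed_index: int = 0) -> List[int]:
    """Greedy MaxMin: return indices of up to k mutually dissimilar compounds."""
    if not fps or k <= 0:
        return []
    k = min(k, len(fps))
    selected = [seed_index]
    # min_dist[i] is the distance from compound i to its nearest selected compound.
    min_dist = [tanimoto_distance(fp, fps[seed_index]) for fp in fps]
    min_dist[seed_index] = -1.0  # sentinel: already selected
    while len(selected) < k:
        # Pick the compound farthest from everything selected so far.
        nxt = max(range(len(fps)), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, fp in enumerate(fps):
            if min_dist[i] >= 0.0:
                min_dist[i] = min(min_dist[i], tanimoto_distance(fp, fps[nxt]))
        min_dist[nxt] = -1.0
    return selected


if __name__ == "__main__":
    # Hypothetical toy fingerprints, not data from the paper.
    toy = [
        frozenset({1, 2, 3}),
        frozenset({1, 2, 4}),
        frozenset({7, 8, 9}),
        frozenset({1, 2, 3, 4}),
    ]
    print(maxmin_select(toy, 3))
```

A selection of this kind favours compounds at the edges of the descriptor space, whereas the representativity techniques listed in the abstract (for example, picking the compounds closest to cluster centroids) favour typical members of each cluster. The abstract reports that only the clustering-based selections consistently outperformed random selection.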
Pages: 1-10
Page count: 10