Molecular diversity and representativity in chemical databases

Cited by: 84
Authors
Bayada, DM [1]
Hamersma, H [1]
van Geerestein, VJ [1]
Affiliations
[1] NV Organon, Dept Mol Design & Informat, NL-5340 BH Oss, Netherlands
DOI
10.1021/ci980109e
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
It is now common practice in the pharmaceutical industry to use molecular diversity selection methods. With the advent of high throughput screening and combinatorial chemistry, compounds must be rationally selected from databases of hundreds of thousands of compounds to be tested for several biological activities. We explore the differences between diversity and representativity. Validation runs were made for different diversity selection methods (such as the MaxMin function), several representativity techniques (selection of compounds closest to centroids of clusters, Kohonen neural networks, nonlinear scaling of descriptor values), and various types of descriptors (topological and 3D fingerprints) including some validated whole-molecule numerical descriptors that were chosen for their correlation with biological activities. We find that only clustering based on fingerprints or on whole-molecule descriptors gives results consistently superior to random selection in extracting a diverse set of activities from a file with potential drug molecules. The results further indicate that clustering selection from fingerprints is biased toward small molecules, a behavior that might partly explain its success over other types of methods. Using numerical descriptors instead of fingerprints removes this bias without penalising performance too much.
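The abstract names the MaxMin function as one of the diversity selection methods but does not describe its details. As a rough illustration only, the sketch below shows a generic greedy MaxMin selection. It assumes fingerprints represented as sets of on-bit indices and Tanimoto distance as the dissimilarity measure; the function names `tanimoto_distance` and `maxmin_select` and the toy data are hypothetical and are not taken from the paper.

```python
def tanimoto_distance(a, b):
    """Return 1 - Tanimoto similarity for two fingerprints given as sets of on-bits."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical (assumption)
    return 1.0 - len(a & b) / union


def maxmin_select(fingerprints, k, seed_index=0):
    """Greedy MaxMin: repeatedly pick the compound whose nearest
    already-selected neighbour is farthest away."""
    if not fingerprints or k <= 0:
        return []
    selected = [seed_index]
    # min_dist[i] = distance from compound i to its closest selected compound
    min_dist = [tanimoto_distance(fp, fingerprints[seed_index]) for fp in fingerprints]
    target = min(k, len(fingerprints))
    while len(selected) < target:
        candidates = [i for i in range(len(fingerprints)) if i not in selected]
        best = max(candidates, key=lambda i: min_dist[i])
        selected.append(best)
        # Update each compound's distance to the selected set
        for i, fp in enumerate(fingerprints):
            d = tanimoto_distance(fp, fingerprints[best])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


# Toy fingerprints (illustrative, not real compounds)
toy = [
    {1, 2, 3},         # 0: seed
    {1, 2, 4},         # 1: moderately close to 0
    {7, 8, 9},         # 2: disjoint from 0
    {1, 2, 3, 4},      # 3: very close to 0
    {7, 8, 9, 10},     # 4: very close to 2
]
picked = maxmin_select(toy, 3)
print(picked)
```

On this toy input the selection spreads across both groups of similar fingerprints rather than taking near-duplicates. A representativity-oriented method, such as picking the compound closest to each cluster centroid, would instead favour compounds typical of densely populated regions.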
Pages: 1-10
Page count: 10