Mining a chemical database for fragment co-occurrence: Discovery of "chemical cliches"

Cited by: 39
Authors
Lameijer, EW
Kok, JN
Bäck, T
Ijzerman, AP
Affiliations
[1] Leiden Univ, Leiden Amsterdam Ctr Drug Res, Div Med Chem, NL-2300 RA Leiden, Netherlands
[2] Leiden Univ, LIACS, NL-2333 CA Leiden, Netherlands
[3] NuTech Solut, D-44227 Dortmund, Germany
DOI
10.1021/ci050370c
Chinese Library Classification (CLC) number
R914 [Medicinal chemistry]
Discipline classification code
100701
Abstract
Nowadays millions of different compounds are known, their structures stored in electronic databases. Analysis of these data could yield valuable insights into the laws of chemistry and the habits of chemists. We have therefore explored the public database of the National Cancer Institute (>250 000 compounds) by pattern searching. We split the molecules of this database into fragments to find out which fragments exist, how frequent they are, and whether the occurrence of one fragment in a molecule is related to the occurrence of another, nonoverlapping fragment. It turns out that some fragments and combinations of fragments are so frequent that they can be called "chemical cliches". We believe that the fragment data can give insight into the chemical space explored so far by synthesis. The lists of fragments and their (co-)occurrences can help create novel chemical compounds by (i) systematically listing the most popular and therefore most easily used substituents and ring systems for synthesizing new compounds, (ii) being an easily accessible repository for rarer fragments suitable for lead compound optimization, and (iii) pointing out some of the yet unexplored parts of chemical space.
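The counting step described in the abstract (how often each fragment occurs, and whether two fragments occur together more often than chance would suggest) can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not say how molecules were fragmented or which statistic was used. The sketch assumes each molecule has already been reduced to a set of fragment labels, ignores the nonoverlap requirement, and uses a simple observed/expected ratio ("lift") as a stand-in measure. The function names `cooccurrence_stats` and `lift` and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_stats(molecules):
    """Count fragments and fragment pairs across molecules.

    molecules: iterable of collections of fragment labels, one per molecule.
    Each fragment is counted at most once per molecule.
    Returns (n_molecules, fragment_counts, pair_counts).
    """
    fragment_counts = Counter()
    pair_counts = Counter()
    n_molecules = 0
    for fragments in molecules:
        n_molecules += 1
        unique = sorted(set(fragments))  # sorted so pair keys are canonical
        fragment_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return n_molecules, fragment_counts, pair_counts


def lift(pair, n_molecules, fragment_counts, pair_counts):
    """Observed pair count divided by the count expected under independence.

    Values above 1 suggest the two fragments co-occur more often than chance.
    Returns 0.0 if either fragment never occurs.
    """
    a, b = sorted(pair)
    expected = fragment_counts[a] * fragment_counts[b] / n_molecules
    if expected == 0:
        return 0.0
    return pair_counts[(a, b)] / expected


# Toy data (hypothetical): each molecule as a set of fragment labels.
toy = [
    {"benzene", "amide"},
    {"benzene", "amide", "methyl"},
    {"benzene", "methyl"},
    {"pyridine", "amide"},
]
n, frag_counts, pair_counts = cooccurrence_stats(toy)
```

Ranking fragments by `frag_counts` and pairs by `lift` would give a rough analogue of the "most popular" fragments and combinations mentioned above; on a real database the fragment sets would come from a cheminformatics toolkit rather than hand-written labels.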
Pages: 553-562
Page count: 10
Related papers
50 in total
  • [21] Composite Spatio-Temporal Co-occurrence Pattern Mining
    Zhang, Zhongnan
    Wu, Weili
    WIRELESS ALGORITHMS, SYSTEMS, AND APPLICATIONS, PROCEEDINGS, 2008, 5258 : 454 - +
  • [22] Co-occurrence of enzyme domains guides the discovery of an oxazolone synthetase
    Tristan de Rond
    Julia E. Asay
    Bradley S. Moore
    Nature Chemical Biology, 2021, 17 : 794 - 799
  • [23] Partial spatio-temporal co-occurrence pattern mining
    Celik, Mete
    KNOWLEDGE AND INFORMATION SYSTEMS, 2015, 44 (01) : 27 - 49
  • [24] Partial spatio-temporal co-occurrence pattern mining
    Mete Celik
    Knowledge and Information Systems, 2015, 44 : 27 - 49
  • [25] Context-Aware Discovery of Visual Co-Occurrence Patterns
    Wang, Hongxing
    Yuan, Junsong
    Wu, Ying
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2014, 23 (04) : 1805 - 1819
  • [26] Mixed-drove spatiotemporal co-occurrence pattern mining
    Celik, Mete
    Shekhar, Shashi
    Rogers, James P.
    Shine, James A.
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2008, 20 (10) : 1322 - 1335
  • [27] Text Topic Mining Based on LDA and Co-occurrence Theory
    Wu Maowen
    Zhang CaiDong
    Lan Weiyao
    Wu QingQiang
    PROCEEDINGS OF 2012 7TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE & EDUCATION, VOLS I-VI, 2012, : 525 - 528
  • [28] Machine discovery based on the co-occurrence of references in a search engine
    Murata, T
    DISCOVERY SCIENCE, PROCEEDINGS, 1999, 1721 : 220 - 229
  • [29] Co-occurrence of enzyme domains guides the discovery of an oxazolone synthetase
    de Rond, Tristan
    Asay, Julia E.
    Moore, Bradley S.
    NATURE CHEMICAL BIOLOGY, 2021, 17 (07) : 794 - 799
  • [30] Language Model Co-occurrence Linking for Interleaved Activity Discovery
    Rogers, Eoin
    Kelleher, John D.
    Ross, Robert J.
    MACHINE LEARNING FOR NETWORKING (MLN 2019), 2020, 12081 : 70 - 84