Mining a chemical database for fragment co-occurrence: Discovery of "chemical cliches"

Cited by: 39
Authors
Lameijer, EW
Kok, JN
Bäck, T
Ijzerman, AP
Affiliations
[1] Leiden Univ, Leiden Amsterdam Ctr Drug Res, Div Med Chem, NL-2300 RA Leiden, Netherlands
[2] Leiden Univ, LIACS, NL-2333 CA Leiden, Netherlands
[3] NuTech Solut, D-44227 Dortmund, Germany
DOI
10.1021/ci050370c
Chinese Library Classification number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
Nowadays millions of different compounds are known, their structures stored in electronic databases. Analysis of these data could yield valuable insights into the laws of chemistry and the habits of chemists. We have therefore explored the public database of the National Cancer Institute (>250 000 compounds) by pattern searching. We split the molecules of this database into fragments to find out which fragments exist, how frequent they are, and whether the occurrence of one fragment in a molecule is related to the occurrence of another, nonoverlapping fragment. It turns out that some fragments and combinations of fragments are so frequent that they can be called "chemical cliches". We believe that the fragment data can give insight into the chemical space explored so far by synthesis. The lists of fragments and their (co-)occurrences can help create novel chemical compounds by (i) systematically listing the most popular and therefore most easily used substituents and ring systems for synthesizing new compounds, (ii) being an easily accessible repository for rarer fragments suitable for lead compound optimization, and (iii) pointing out some of the yet unexplored parts of chemical space.
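The co-occurrence question raised in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify how molecules are fragmented or which statistic is used. The sketch assumes each molecule has already been reduced to a set of fragment labels, and it compares the observed pair count with the count expected if the two fragments occurred independently. The function name `cooccurrence` and the toy data are hypothetical, and the nonoverlap constraint on fragments is not modeled.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(molecules):
    """Count fragments and fragment pairs across molecules.

    `molecules` is an iterable of collections of fragment labels
    (an assumed input format, not taken from the paper).
    Returns (fragment counts, pair counts, observed/expected ratios).
    """
    molecules = list(molecules)
    n = len(molecules)
    frag_counts = Counter()
    pair_counts = Counter()
    for frags in molecules:
        # Count each fragment at most once per molecule.
        unique = sorted(set(frags))
        frag_counts.update(unique)
        # Sorted input gives each unordered pair one canonical key.
        pair_counts.update(combinations(unique, 2))

    ratios = {}
    for (a, b), observed in pair_counts.items():
        # Expected number of molecules containing both fragments
        # if their occurrences were independent.
        expected = frag_counts[a] * frag_counts[b] / n
        ratios[(a, b)] = observed / expected
    return frag_counts, pair_counts, ratios


# Hypothetical toy data: four "molecules" as sets of fragment labels.
toy = [
    {"benzene", "methyl"},
    {"benzene", "methyl", "hydroxyl"},
    {"benzene", "hydroxyl"},
    {"pyridine", "methyl"},
]
frag_counts, pair_counts, ratios = cooccurrence(toy)
```

A ratio well above 1 would flag a pair that appears together more often than chance predicts, which is one plausible reading of a "cliche" combination. On a database of real size, the ratio alone is unreliable for rare fragments, so a significance test would be needed as well.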
Pages: 553-562
Page count: 10