Linguistic measures of chemical diversity and the “keywords” of molecular collections

Authors
Michał Woźniak
Agnieszka Wołos
Urszula Modrzyk
Rafał L. Górski
Jan Winkowski
Michał Bajczyk
Sara Szymkuć
Bartosz A. Grzybowski
Maciej Eder
Affiliations
[1] Polish Academy of Sciences, Institute of Polish Language
[2] Polish Academy of Sciences, Institute of Organic Chemistry
[3] Center for Soft and Living Matter of Korea’s Institute for Basic Science (IBS), Department of Chemistry
[4] Ulsan National Institute of Science and Technology
Abstract
Computerized linguistic analyses have proven of immense value in comparing and searching through large text collections (“corpora”), including those deposited on the Internet. Indeed, it would nowadays be hard to imagine browsing the Web without, for instance, search algorithms that extract the most appropriate keywords from documents. This paper describes how such corpus-linguistic concepts can be extended to chemistry based on characteristic “chemical words” that span more than traditional functional groups and instead capture the common structural fragments that molecules share. Using these words, it is possible to quantify the diversity of chemical collections and databases in new ways and to define molecular “keywords” by which such collections are best characterized and annotated.
Related papers
  • [1] Linguistic measures of chemical diversity and the "keywords" of molecular collections
    Wozniak, Michal
    Wolos, Agnieszka
    Modrzyk, Urszula
    Gorski, Rafal L.
    Winkowski, Jan
    Bajczyk, Michal
    Szymkuc, Sara
    Grzybowski, Bartosz A.
    Eder, Maciej
    [J]. SCIENTIFIC REPORTS, 2018, 8
  • [2] AN EXTENSION OF GREENBERG LINGUISTIC DIVERSITY MEASURES
    LIEBERSON, S
    [J]. LANGUAGE, 1964, 40 (04) : 526 - 531
  • [3] Semantic Measures for Keywords Extraction
    Colla, Davide
    Mensa, Enrico
    Radicioni, Daniele P.
    [J]. AI*IA 2017 ADVANCES IN ARTIFICIAL INTELLIGENCE, 2017, 10640 : 128 - 140
  • [4] THE CHEMICAL GENERATION OF MOLECULAR DIVERSITY
    PAVIA, MR
    [J]. CHIMICA OGGI-CHEMISTRY TODAY, 1995, 13 (7-8) : 16 - 18
  • [5] Atomic Diversity, Molecular Diversity, and Chemical Diversity: The Concept of Chemodiversity
    Testa, Bernard
    Vistoli, Giulio
    Pedretti, Alessandro
    Bojarski, Andrzej J.
    [J]. CHEMISTRY & BIODIVERSITY, 2009, 6 (08) : 1145 - 1151
  • [6] CONTINUOUS COLLECTIONS OF MEASURES
    BLUMENTHAL, RM
    CORSON, HH
    [J]. ANNALES DE L INSTITUT FOURIER, 1970, 20 (02) : 193 - +
  • [7] Molecular diversity and representativity in chemical databases
    Bayada, DM
    Hamersma, H
    van Geerestein, VJ
    [J]. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1999, 39 (01) : 1 - 10
  • [8] ... and linguistic diversity
    Weeks, DE
    O'Connell, JR
    Schmidtová, Z
    [J]. NATURE, 1998, 391 (6663) : 118 - 118
  • [10] Hominid phylogeny: morphological and molecular measures of diversity.
    Eckhardt, RB
    [J]. AMERICAN JOURNAL OF PHYSICAL ANTHROPOLOGY, 2001, : 61 - 62