Chemical language models for molecular design

Cited by: 3
Author
Bajorath, Juergen [1, 2, 3]
Affiliations
[1] Rheinische Friedrich Wilhelms Univ Bonn, Bonn Aachen Int Ctr Informat Technol, Dept Life Sci Informat, Bonn, Germany
[2] Rheinische Friedrich Wilhelms Univ Bonn, Lamarr Inst Machine Learning & Artificial Intellig, Bonn, Germany
[3] Rheinische Friedrich Wilhelms Univ Bonn, Bonn Aachen Int Ctr Informat Technol, Dept Life Sci Informat, Friedrich Hirzebruch Allee 5-6, D-53115 Bonn, Germany
Keywords
drug design; language models; recurrent neural networks; encoder-decoder frameworks; transformers; attention mechanisms; TRANSFORMER; DISCOVERY
DOI
10.1002/minf.202300288
Chinese Library Classification number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
In drug discovery, chemical language models (CLMs), which originate from natural language processing, offer new opportunities for molecular design. CLMs have been developed using recurrent neural network (RNN) or transformer architectures. Attention mechanisms play a central role in the predictive performance of both RNN-based encoder-decoder frameworks and transformers. Emerging application areas for CLMs include constrained generative modeling and the prediction of chemical reactions or drug-target interactions, among others. Since CLMs are applicable to any compound or target data that can be represented in a sequential format and tokenized, mappings between different types of sequences can be learned. For example, active compounds can be predicted from protein sequence motifs. Novel off-the-beaten-path applications can also be considered. For example, analogue series from medicinal chemistry can be perceived and represented as chemical sequences and extended with new compounds using CLMs. Herein, methodological features of CLMs and different applications are discussed.
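As an illustration of what "presented in a sequential format and tokenized" can mean in practice, the following is a minimal sketch that splits a SMILES string into tokens and maps them to integer IDs, the form a CLM would consume. The abstract does not name SMILES, this regular expression, or any of these function names; they are assumptions chosen for illustration, and the pattern covers only a common subset of SMILES syntax.

```python
import re

# Assumed token pattern (not taken from the article): bracket atoms first,
# then two-letter halogens, then single-letter atoms, bonds, branches and
# ring-closure digits. Order matters so that "Cl" is not split into "C" + "l".
SMILES_TOKEN_PATTERN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|@@|%\d{2}|[BCNOSPFIbcnosp]|[=#\-\+\\/\(\)\.:~@\?\*\$]|\d)"
)


def tokenize_smiles(smiles):
    """Split a SMILES string into a list of tokens.

    Raises ValueError if some characters are not covered by the pattern,
    so that unsupported input is not silently truncated.
    """
    tokens = SMILES_TOKEN_PATTERN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError("unsupported characters in SMILES: %r" % smiles)
    return tokens


def build_vocab(token_lists):
    """Assign an integer ID to every distinct token, after three special tokens."""
    vocab = {"<pad>": 0, "<bos>": 1, "<eos>": 2}
    for tokens in token_lists:
        for token in tokens:
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(tokens, vocab):
    """Convert tokens to IDs, wrapped in begin- and end-of-sequence markers."""
    return [vocab["<bos>"]] + [vocab[t] for t in tokens] + [vocab["<eos>"]]


if __name__ == "__main__":
    aspirin = tokenize_smiles("CC(=O)Oc1ccccc1C(=O)O")
    vocab = build_vocab([aspirin])
    print(aspirin)
    print(encode(aspirin, vocab))
```

The same scheme would apply to any other sequential representation, for example protein sequences tokenized per residue, which is what allows mappings between different sequence types to be learned.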
Pages: 7