RetroCaptioner: beyond attention in end-to-end retrosynthesis transformer via contrastively captioned learnable graph representation

Cited: 0
Authors
Liu, Xiaoyi [1 ,2 ]
Ai, Chengwei [3 ]
Yang, Hongpeng [4 ]
Dong, Ruihan [5 ]
Tang, Jijun [6 ,7 ]
Zheng, Shuangjia [8 ]
Guo, Fei [3 ]
Affiliations
[1] Beijing Univ Chinese Med, Sch Chinese Mat Med, Beijing, Peoples R China
[2] Minist Educ, Engn Res Ctr Pharmaceut Chinese Mat Med & New Drug, Beijing 100102, Peoples R China
[3] Cent South Univ, Comp Sci & Engn, 932 Lushan St, Changsha 410083, Peoples R China
[4] Univ South Carolina, Comp Sci & Engn, Columbia, SC 29208 USA
[5] Peking Univ, Acad Adv Interdisciplinary Studies, Beijing 100871, Peoples R China
[6] Shenzhen Univ Adv Technol, Fac Comp Sci & Control Engn, Shenzhen 518055, Peoples R China
[7] Chinese Acad Sci, Shenzhen Inst Adv Technol, 1068 Xueyuan Ave, Nanshan 518055, Peoples R China
[8] Shanghai Jiao Tong Univ, Global Inst Future Technol, 800 Dongchuan Rd, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China
DOI: 10.1093/bioinformatics/btae561
Chinese Library Classification: Q5 [Biochemistry]
Discipline codes: 071010; 081704
Abstract
Motivation: Retrosynthesis identifies available precursor molecules for diverse and novel compounds. With the advances and practicality of language models, Transformer-based models have increasingly been used to automate this process. However, many existing methods struggle to efficiently capture reaction transformation information, limiting the accuracy and applicability of their predictions.
Results: We introduce RetroCaptioner, an advanced end-to-end, Transformer-based framework featuring a Contrastive Reaction Center Captioner. This captioner guides the training of dual-view attention models via contrastive learning, leveraging learned molecular graph representations to capture chemically plausible constraints within a single-step learning process. We integrate the single-encoder, dual-encoder, and encoder-decoder paradigms to fuse information from the sequence and graph representations of molecules, modifying the Transformer encoder into a uni-view sequence encoder and a dual-view module, and enhancing the captioning of atomic correspondence between SMILES strings and graphs. RetroCaptioner achieves 67.2% top-1 and 93.4% top-10 exact-match accuracy on the USPTO-50k dataset, alongside a SMILES validity score of 99.4%. In addition, RetroCaptioner demonstrates its reliability by generating synthetic routes for the drug protokylol.
Availability and implementation: The code and data are available at https://github.com/guofei-tju/RetroCaptioner.
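The abstract describes aligning sequence (SMILES) and graph views of the same molecule with a contrastive objective. A common formulation of such an objective is the symmetric softmax-based (InfoNCE-style) loss, where paired sequence/graph embeddings are pulled together and mismatched pairs pushed apart. The sketch below is a minimal, dependency-free illustration of that general idea under assumed toy embeddings; it is not the authors' implementation, and the function name `info_nce` and temperature `tau` are illustrative choices.

```python
import math

def info_nce(seq_embs, graph_embs, tau=0.1):
    """Contrastive loss over paired embeddings: entry i of seq_embs and
    entry i of graph_embs are a positive pair; all other combinations
    within the batch act as negatives."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def cosine(u, v):
        return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

    n = len(seq_embs)
    loss = 0.0
    for i in range(n):
        # Temperature-scaled similarities of sequence i to every graph.
        sims = [math.exp(cosine(seq_embs[i], g) / tau) for g in graph_embs]
        # Cross-entropy toward the matching graph (index i).
        loss += -math.log(sims[i] / sum(sims))
    return loss / n

# Toy check: the loss is near zero when views are aligned and large
# when the pairing is shuffled.
seq = [[1.0, 0.0], [0.0, 1.0]]
print(info_nce(seq, seq))                  # aligned pairs -> small loss
print(info_nce(seq, [seq[1], seq[0]]))     # mismatched pairs -> large loss
```

In the paper's setting, the embeddings would come from the uni-view sequence encoder and the graph branch of the dual-view module; here they are hand-written vectors purely to show the loss behavior.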
Pages: 9
Related papers (50 records)
  • [1] Retroformer: Pushing the Limits of End-to-end Retrosynthesis Transformer
    Wan, Yue
    Hsieh, Chang-Yu
    Liao, Benben
    Zhang, Shengyu
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [2] Retrosynthesis prediction using an end-to-end graph generative architecture for molecular graph editing
    Zhong, Weihe
    Yang, Ziduo
    Chen, Calvin Yu-Chian
    [J]. NATURE COMMUNICATIONS, 2023, 14 (01)
  • [4] SGTR: End-to-end Scene Graph Generation with Transformer
    Li, Rongjie
    Zhang, Songyang
    He, Xuming
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 19464 - 19474
  • [5] A Novel End-to-End Transformer for Scene Graph Generation
    Ren, Chengkai
    Liu, Xiuhua
    Cao, Mengyuan
    Zhang, Jian
    Wang, Hongwei
    [J]. 2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [6] An End-to-End Multiplex Graph Neural Network for Graph Representation Learning
    Liang, Yanyan
    Zhang, Yanfeng
    Gao, Dechao
    Xu, Qian
    [J]. IEEE ACCESS, 2021, 9 : 58861 - 58869
  • [7] SGTR+: End-to-End Scene Graph Generation With Transformer
    Li, Rongjie
    Zhang, Songyang
    He, Xuming
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (04) : 2191 - 2205
  • [8] End-to-End Video Scene Graph Generation With Temporal Propagation Transformer
    Zhang, Yong
    Pan, Yingwei
    Yao, Ting
    Huang, Rui
    Mei, Tao
    Chen, Chang-Wen
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 1613 - 1625
  • [9] End-to-End Single Shot Detector Using Graph-Based Learnable Duplicate Removal
    Ding, Shuxiao
    Rehder, Eike
    Schneider, Lukas
    Cordts, Marius
    Gall, Juergen
    [J]. PATTERN RECOGNITION, DAGM GCPR 2022, 2022, 13485 : 375 - 389
  • [10] IMPROVING MANDARIN END-TO-END SPEECH SYNTHESIS BY SELF-ATTENTION AND LEARNABLE GAUSSIAN BIAS
    Yang, Fengyu
    Yang, Shan
    Zhu, Pengcheng
    Yan, Pengju
    Xie, Lei
    [J]. 2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 208 - 213