High performance binding affinity prediction with a Transformer-based surrogate model

Cited by: 0
Authors
Vasan, Archit [1 ]
Gokdemir, Ozan [1 ,2 ]
Brace, Alexander [1 ,2 ]
Ramanathan, Arvind [1 ,2 ]
Brettin, Thomas [1 ]
Stevens, Rick [1 ,2 ]
Vishwanath, Venkatram [1 ]
Affiliations
[1] Argonne Natl Lab, Lemont, IL 60439 USA
[2] Univ Chicago, Chicago, IL 60637 USA
Funding
National Institutes of Health (NIH);
Keywords
drug discovery; virtual screening; docking surrogates; high performance computing; transformers; SMILES;
DOI
10.1109/IPDPSW63119.2024.00114
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology];
Discipline classification code
0812;
Abstract
In the current paradigm of drug discovery pipelines, identification of compounds that bind to a target with high affinity constitutes the first step. This is typically performed using resource-intensive experimental methods to screen vast chemical search spaces - a key bottleneck in the drug-discovery pipeline. To streamline this process, highly scalable computational screening methods with acceptable fidelity are needed to screen larger portions of the chemical search space and identify promising candidates to be validated using experiments. Machine learning methods, namely surrogate models, have recently emerged as favorable alternatives for performing this computational screening. In this work, we present Simple SMILES Transformer (SST), an accurate and highly scalable binding affinity prediction method that approximates the computationally intensive molecular docking process using an encoder-only Transformer architecture. We benchmark our model against two baselines that feature fundamentally different approaches to docking surrogates: RegGO, a MORDRED-fingerprint-based multi-layer perceptron model, and Chemprop, a directed message-passing graph neural network. Unlike Chemprop and RegGO, our method operates solely on the SMILES representation of molecules without needing additional featurization, which leads to reduced preprocessing overhead, higher inference throughput, and thus better scalability. We train SST in a distributed fashion on the Polaris supercomputer at the Argonne Leadership Computing Facility (ALCF). We then deploy it at an unprecedented scale for inference across 256 compute nodes of ALCF's Aurora supercomputer to screen 22 billion compounds in 40 minutes in search of hits with high binding affinity to the oncoprotein RtcB ligase. SST predictions emphasize several molecular motifs that have previously been confirmed to interact with residues in their target binding pockets.
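The record itself contains no code; as a rough illustration of the approach the abstract describes, the following is a minimal PyTorch sketch of an encoder-only Transformer that regresses a docking-style affinity score directly from tokenized SMILES strings. The class name, character-level tokenizer, vocabulary, and all hyperparameters are hypothetical placeholders chosen for the example, not the configuration reported for SST.

# Illustrative sketch only: an encoder-only Transformer mapping a SMILES
# string to a scalar binding-affinity (docking-score) prediction.
# Vocabulary, dimensions, and layer counts are placeholders.
import torch
import torch.nn as nn

class SmilesAffinityRegressor(nn.Module):
    def __init__(self, vocab_size=64, d_model=256, nhead=8,
                 num_layers=6, max_len=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model, padding_idx=0)
        self.pos = nn.Embedding(max_len, d_model)
        enc_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead, dim_feedforward=4 * d_model,
            batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=num_layers)
        self.head = nn.Linear(d_model, 1)  # regression to a single score

    def forward(self, tokens, pad_mask):
        # tokens: (batch, seq_len) integer-encoded SMILES characters
        # pad_mask: (batch, seq_len), True where the position is padding
        positions = torch.arange(tokens.size(1), device=tokens.device)
        x = self.embed(tokens) + self.pos(positions)
        x = self.encoder(x, src_key_padding_mask=pad_mask)
        # mean-pool over non-padded positions, then regress to a scalar
        mask = (~pad_mask).unsqueeze(-1).float()
        pooled = (x * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)
        return self.head(pooled).squeeze(-1)

# Toy usage with a hypothetical character-level tokenizer:
def tokenize(smiles, stoi, max_len=128):
    ids = [stoi.get(c, 1) for c in smiles[:max_len]]  # index 1 = unknown
    ids += [0] * (max_len - len(ids))                 # index 0 = padding
    return torch.tensor(ids)

stoi = {c: i + 2 for i, c in enumerate("CNOclnos()=#123456789[]@+-H")}
batch = torch.stack([tokenize("CCO", stoi), tokenize("c1ccccc1O", stoi)])
pad_mask = batch == 0
model = SmilesAffinityRegressor(vocab_size=len(stoi) + 2)
print(model(batch, pad_mask).shape)  # torch.Size([2])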
Pages: 571 - 580
Page count: 10