High performance binding affinity prediction with a Transformer-based surrogate model

Cited: 0
Authors
Vasan, Archit [1 ]
Gokdemir, Ozan [1 ,2 ]
Brace, Alexander [1 ,2 ]
Ramanathan, Arvind [1 ,2 ]
Brettin, Thomas [1 ]
Stevens, Rick [1 ,2 ]
Vishwanath, Venkatram [1 ]
Affiliations
[1] Argonne Natl Lab, Lemont, IL 60439 USA
[2] Univ Chicago, Chicago, IL 60637 USA
Funding
U.S. National Institutes of Health
Keywords
drug discovery; virtual screening; docking surrogates; high performance computing; transformers; SMILES;
DOI
10.1109/IPDPSW63119.2024.00114
Chinese Library Classification
TP3 [computing technology; computer technology]
Discipline code
0812
Abstract
In the current paradigm of drug discovery pipelines, identification of compounds that bind to a target with high affinity constitutes the first step. This is typically performed using resource-intensive experimental methods to screen vast chemical search spaces - a key bottleneck in the drug-discovery pipeline. To streamline this process, highly scalable computational screening methods with acceptable fidelity are needed to screen larger portions of the chemical search space and identify promising candidates to be validated experimentally. Machine learning methods, namely surrogate models, have recently evolved into favorable alternatives for performing this computational screening. In this work, we present Simple SMILES Transformer (SST), an accurate and highly scalable binding affinity prediction method that approximates the computationally intensive molecular docking process using an encoder-only Transformer architecture. We benchmark our model against two baselines that take fundamentally different approaches to docking surrogates: RegGO, a MORDRED-fingerprint-based multi-layer perceptron model, and Chemprop, a directed message-passing graph neural network. Unlike Chemprop and RegGO, our method operates solely on the SMILES representation of molecules without additional featurization, which reduces preprocessing overhead and yields higher inference throughput and thus better scalability. We train SST in a distributed fashion on the Polaris supercomputer at the Argonne Leadership Computing Facility (ALCF). We then deploy it at an unprecedented scale for inference across 256 compute nodes of ALCF's Aurora supercomputer, screening 22 billion compounds in 40 minutes in search of hits with high binding affinity to the oncoprotein RtcB ligase. SST predictions emphasize several molecular motifs that have previously been confirmed to interact with residues in their target binding pockets.
Pages: 571-580
Page count: 10
Related papers (showing 10 of 50)
  • [1] Vision Transformer-Based Photovoltaic Prediction Model
    Kang, Zaohui
    Xue, Jizhong
    Lai, Chun Sing
    Wang, Yu
    Yuan, Haoliang
    Xu, Fangyuan
    ENERGIES, 2023, 16 (12)
  • [2] PiTE: TCR-epitope Binding Affinity Prediction Pipeline using Transformer-based Sequence Encoder
    Zhang, Pengfei
    Bang, Seojin
    Lee, Heewook
    BIOCOMPUTING 2023, PSB 2023, 2023, : 347 - 358
  • [3] Deep-ProBind: binding protein prediction with transformer-based deep learning model
    Khan, Salman
    Noor, Sumaiya
    Awan, Hamid Hussain
    Iqbal, Shehryar
    AlQahtani, Salman A.
    Dilshad, Naqqash
    Ahmad, Nijad
    BMC Bioinformatics, 2025, 26 (01)
  • [4] The novel graph transformer-based surrogate model for learning physical systems
    Feng, Bo
    Zhou, Xiao-Ping
    Computer Methods in Applied Mechanics and Engineering, 2024, 432
  • [5] Transformer-based power system energy prediction model
    Rao, Zhuyi
    Zhang, Yunxiang
    PROCEEDINGS OF 2020 IEEE 5TH INFORMATION TECHNOLOGY AND MECHATRONICS ENGINEERING CONFERENCE (ITOEC 2020), 2020, : 913 - 917
  • [6] A Transformer-Based Model for the Prediction of Human Gaze Behavior on Videos
    Ozdel, Suleyman
    Rong, Yao
    Albaba, Berat Mert
    Kuo, Yen-Ling
    Wang, Xi
    Kasneci, Enkelejda
    PROCEEDINGS OF THE 2024 ACM SYMPOSIUM ON EYE TRACKING RESEARCH & APPLICATIONS, ETRA 2024, 2024,
  • [7] MPformer: A Transformer-Based Model for Earthen Ruins Climate Prediction
    Xu, Guodong
    Wang, Hai
    Ji, Shuo
    Ma, Yuhui
    Feng, Yi
    TSINGHUA SCIENCE AND TECHNOLOGY, 2024, 29 (06): : 1829 - 1838
  • [8] A Transformer-Based Model for Short-Term Landslide Displacement Prediction
    Tian Y.
    Pang X.
    Zhao W.
    Chang X.
    Cheng C.
    Zou P.
    Cao X.
    Beijing Daxue Xuebao (Ziran Kexue Ban)/Acta Scientiarum Naturalium Universitatis Pekinensis, 2023, 59 (02): : 197 - 210
  • [9] A Transformer-Based Model for Time Series Prediction of Remote Sensing Data
    Niu, Xintian
Liu, Yige
    Ma, Ming
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT II, ICIC 2024, 2024, 14876 : 188 - 200
  • [10] TemproNet: A transformer-based deep learning model for seawater temperature prediction
    Chen, Qiaochuan
    Cai, Candong
    Chen, Yaoran
    Zhou, Xi
    Zhang, Dan
    Peng, Yan
    OCEAN ENGINEERING, 2024, 293