High performance binding affinity prediction with a Transformer-based surrogate model

Cited: 0
Authors
Vasan, Archit [1 ]
Gokdemir, Ozan [1 ,2 ]
Brace, Alexander [1 ,2 ]
Ramanathan, Arvind [1 ,2 ]
Brettin, Thomas [1 ]
Stevens, Rick [1 ,2 ]
Vishwanath, Venkatram [1 ]
Affiliations
[1] Argonne Natl Lab, Lemont, IL 60439 USA
[2] Univ Chicago, Chicago, IL 60637 USA
Funding
U.S. National Institutes of Health
Keywords
drug discovery; virtual screening; docking surrogates; high performance computing; transformers; SMILES;
DOI
10.1109/IPDPSW63119.2024.00114
CLC Number
TP3 [Computing Technology, Computer Technology]
Discipline Code
0812
Abstract
In the current paradigm of drug discovery pipelines, identifying compounds that bind to a target with high affinity constitutes the first step. This is typically performed with resource-intensive experimental methods that screen vast chemical search spaces, a key bottleneck in the drug-discovery pipeline. To streamline this process, highly scalable computational screening methods with acceptable fidelity are needed to cover larger portions of the chemical search space and identify promising candidates to be validated experimentally. Machine learning surrogate models have recently emerged as favorable alternatives for this computational screening. In this work, we present the Simple SMILES Transformer (SST), an accurate and highly scalable binding affinity prediction method that approximates the computationally intensive molecular docking process using an encoder-only Transformer architecture. We benchmark our model against two baselines that take fundamentally different approaches to docking surrogates: RegGO, a multi-layer perceptron model based on MORDRED fingerprints, and Chemprop, a directed message-passing graph neural network. Unlike Chemprop and RegGO, our method operates solely on the SMILES representation of molecules without additional featurization, which reduces preprocessing overhead and yields higher inference throughput and thus better scalability. We train SST in a distributed fashion on the Polaris supercomputer at the Argonne Leadership Computing Facility (ALCF). We then deploy it at an unprecedented scale for inference across 256 compute nodes of ALCF's Aurora supercomputer, screening 22 billion compounds in 40 minutes in search of hits with high binding affinity to the oncoprotein RtcB ligase. SST's predictions emphasize several molecular motifs that have previously been confirmed to interact with residues in their target binding pockets.
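The paper's code is not reproduced on this page, but the abstract's description of SST (an encoder-only Transformer that regresses a docking score directly from tokenized SMILES strings, with no extra featurization step) suggests a model of roughly the following shape. This is a minimal illustrative sketch in PyTorch, not the authors' implementation: the vocabulary size, model dimensions, character-level tokenization, and mean-pooling regression head are all assumptions.

import torch
import torch.nn as nn

class SMILESSurrogate(nn.Module):
    # Sketch of an encoder-only Transformer docking surrogate.
    # All hyperparameters are illustrative, not the paper's values.
    def __init__(self, vocab_size=64, d_model=256, n_heads=8,
                 n_layers=6, max_len=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model, padding_idx=0)  # token id 0 = padding
        self.pos = nn.Parameter(torch.zeros(1, max_len, d_model))      # learned positional encoding
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, 1)  # scalar docking-score regressor

    def forward(self, tokens):             # tokens: (batch, seq_len) int64
        pad = tokens.eq(0)                 # True at padded positions
        x = self.embed(tokens) + self.pos[:, : tokens.size(1)]
        x = self.encoder(x, src_key_padding_mask=pad)
        x = x.masked_fill(pad.unsqueeze(-1), 0.0)
        # Mean-pool over valid (non-padding) tokens, then regress.
        pooled = x.sum(dim=1) / (~pad).sum(dim=1, keepdim=True).clamp(min=1)
        return self.head(pooled).squeeze(-1)   # (batch,) predicted affinity

model = SMILESSurrogate()
dummy = torch.randint(1, 64, (4, 100))  # 4 dummy tokenized SMILES strings
print(model(dummy).shape)               # torch.Size([4])

For a sense of the reported scale: screening 22 billion compounds in 40 minutes corresponds to an aggregate throughput of roughly 9.2 million predictions per second, or about 36,000 per second on each of the 256 Aurora nodes.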
Pages: 571 - 580 (10 pages)
Related Papers (50 total)
  • [21] A transformer-based model for default prediction in mid-cap corporate markets
    Korangi, Kamesh
    Mues, Christophe
    Bravo, Cristian
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2023, 308 (01) : 306 - 320
  • [22] CityTransformer: A Transformer-Based Model for Contaminant Dispersion Prediction in a Realistic Urban Area
    Asahi, Yuuichi
    Onodera, Naoyuki
    Hasegawa, Yuta
    Shimokawabe, Takashi
    Shiba, Hayato
    Idomura, Yasuhiro
    BOUNDARY-LAYER METEOROLOGY, 2023, 186 (03) : 659 - 692
  • [23] Multi-Modal Pedestrian Crossing Intention Prediction with Transformer-Based Model
    Wang, Ting-Wei
    Lai, Shang-Hong
    APSIPA TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING, 2024, 13 (05)
  • [24] TransPTM: a transformer-based model for non-histone acetylation site prediction
    Meng, Lingkuan
    Chen, Xingjian
    Cheng, Ke
    Chen, Nanjun
    Zheng, Zetian
    Wang, Fuzhou
    Sun, Hongyan
    Wong, Ka-Chun
    BRIEFINGS IN BIOINFORMATICS, 2024, 25 (03)
  • [25] Robust Transformer-based model for spatiotemporal PM2.5 prediction in California
    Tong, Weitian
    Limperis, Jordan
    Hamza-Lup, Felix
    Xu, Yao
    Li, Lixin
    EARTH SCIENCE INFORMATICS, 2024, 17 (01) : 315 - 328
  • [26] Comprehensive Transformer-Based Model Architecture for Real-World Storm Prediction
    Lin, Fudong
    Yuan, Xu
    Zhang, Yihe
    Sigdel, Purushottam
    Chen, Li
    Peng, Lu
    Tzeng, Nian-Feng
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: APPLIED DATA SCIENCE AND DEMO TRACK, ECML PKDD 2023, PT VII, 2023, 14175 : 54 - 71
  • [27] An Explainable Transformer-Based Deep Learning Model for the Prediction of Incident Heart Failure
    Rao, Shishir
    Li, Yikuan
    Ramakrishnan, Rema
    Hassaine, Abdelaali
    Canoy, Dexter
    Cleland, John
    Lukasiewicz, Thomas
    Salimi-Khorshidi, Gholamreza
    Rahimi, Kazem
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2022, 26 (07) : 3362 - 3372
  • [29] Pedestrian Crossing Intention Prediction with Multi-Modal Transformer-Based Model
Wang, Ting-Wei
    Lai, Shang-Hong
2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023 : 1349 - 1356
  • [30] Traffic Transformer: Transformer-based framework for temporal traffic accident prediction
    Al-Thani, Mansoor G.
    Sheng, Ziyu
    Cao, Yuting
    Yang, Yin
AIMS MATHEMATICS, 2024, 9 (05) : 12610 - 12629