Scalable training of graph convolutional neural networks for fast and accurate predictions of HOMO-LUMO gap in molecules

Cited by: 10
Authors
Choi, Jong Youl [1 ]
Zhang, Pei [2 ]
Mehta, Kshitij [1 ]
Blanchard, Andrew [2 ]
Pasini, Massimiliano Lupo [2 ]
Institutions
[1] Oak Ridge Natl Lab, Comp Sci & Math Div, 1 Bethel Valley Rd, Oak Ridge, TN 37831 USA
[2] Oak Ridge Natl Lab, Computat Sci & Engn Div, 1 Bethel Valley Rd, Oak Ridge, TN 37831 USA
Keywords
Graph neural networks; Distributed data parallelism; Surrogate models; Atomic modeling; Molecular dynamics; HOMO-LUMO gap
DOI
10.1186/s13321-022-00652-1
CLC Number
O6 [Chemistry]
Discipline Code
0703
Abstract
Graph Convolutional Neural Networks (GCNNs) are a popular class of deep learning (DL) models in materials science for predicting material properties from graph representations of molecular structures. Training an accurate and comprehensive GCNN surrogate for molecular design requires large-scale graph datasets and is usually a time-consuming process. Recent advances in GPUs and distributed computing open a path to effectively reduce the computational cost of GCNN training. However, efficient utilization of high-performance computing (HPC) resources for training requires simultaneously optimizing large-scale data management and scalable stochastic batched optimization techniques. In this work, we focus on building GCNN models on HPC systems to predict the material properties of millions of molecules. We use HydraGNN, our in-house library for large-scale GCNN training that leverages distributed data parallelism in PyTorch, together with ADIOS, a high-performance data management framework, for efficient storage and reading of large molecular graph data. We perform parallel training on two open-source large-scale graph datasets to build a GCNN predictor for an important quantum property known as the HOMO-LUMO gap. We measure the scalability, accuracy, and convergence of our approach on two DOE supercomputers: the Summit supercomputer at the Oak Ridge Leadership Computing Facility (OLCF) and the Perlmutter system at the National Energy Research Scientific Computing Center (NERSC). Our experimental results with HydraGNN show (i) up to a 4.2x reduction in data loading time compared with a conventional method and (ii) linear scaling of training performance up to 1024 GPUs on both Summit and Perlmutter.
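For context on the target property above: the HOMO-LUMO gap is simply the energy difference between the lowest unoccupied (LUMO) and highest occupied (HOMO) molecular orbitals. A minimal sketch of this definition, using hypothetical orbital energies not taken from the paper's datasets:

```python
def homo_lumo_gap(orbital_energies, n_electrons):
    """Return the HOMO-LUMO gap for a closed-shell molecule.

    Each spatial orbital holds two electrons, so the HOMO is
    orbital index n_electrons // 2 - 1 in energy-sorted order,
    and the LUMO is the orbital immediately above it.
    """
    energies = sorted(orbital_energies)
    homo_index = n_electrons // 2 - 1
    e_homo = energies[homo_index]
    e_lumo = energies[homo_index + 1]
    return e_lumo - e_homo

# Hypothetical orbital energies (eV) for a 4-electron system:
energies = [-15.2, -9.7, -1.3, 2.4]
gap = homo_lumo_gap(energies, n_electrons=4)
print(f"HOMO-LUMO gap: {gap:.1f} eV")  # HOMO = -9.7 eV, LUMO = -1.3 eV
```

The gap is the quantity the GCNN surrogate learns to regress directly from the molecular graph, bypassing the quantum-chemistry calculation that would otherwise produce the orbital energies.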
Pages: 10
Related Papers
50 records in total
  • [1] Scalable training of graph convolutional neural networks for fast and accurate predictions of HOMO-LUMO gap in molecules
    Jong Youl Choi
    Pei Zhang
    Kshitij Mehta
    Andrew Blanchard
    Massimiliano Lupo Pasini
    Journal of Cheminformatics, 14
  • [2] Electrostatic considerations affecting the calculated HOMO-LUMO gap in protein molecules
    Lever, Greg
    Cole, Daniel J.
    Hine, Nicholas D. M.
    Haynes, Peter D.
    Payne, Mike C.
    Journal of Physics: Condensed Matter, 2013, 25 (15)
  • [3] Accurate, efficient and scalable training of Graph Neural Networks
    Zeng, Hanqing
    Zhou, Hongkuan
    Srivastava, Ajitesh
    Kannan, Rajgopal
    Prasanna, Viktor
    Journal of Parallel and Distributed Computing, 2021, 147: 166-183
  • [4] Fast and Accurate Predictions of Total Energy for Solid Solution Alloys with Graph Convolutional Neural Networks
    Pasini, Massimiliano Lupo
    Burcul, Marko
    Reeve, Samuel Temple
    Eisenbach, Markus
    Perotto, Simona
    Driving Scientific and Engineering Discoveries Through the Integration of Experiment, Big Data, and Modeling and Simulation, 2022, 1512: 79-98
  • [5] Tuning the HOMO-LUMO gap of polycyclic conjugated molecules using benzo-annelation strategy
    Radenkovic, Slavko
    Dordevic, Sladana
    Nikolendzic, Marijana
    Chemical Physics Letters, 2024, 856
  • [6] FairSample: Training Fair and Accurate Graph Convolutional Neural Networks Efficiently
    Cong, Zicun
    Shi, Baoxu
    Li, Shan
    Yang, Jaewon
    He, Qi
    Pei, Jian
    IEEE Transactions on Knowledge and Data Engineering, 2024, 36 (04): 1537-1551
  • [7] FAST-GO: Fast, Accurate, and Scalable Hardware Trojan Detection using Graph Convolutional Networks
    Imangholi, Ali
    Hashemi, Mona
    Momeni, Amirabbas
    Mohammadi, Siamak
    Carlson, Trevor E.
    2024 25th International Symposium on Quality Electronic Design (ISQED 2024), 2024
  • [8] FAST GRAPH CONVOLUTIONAL RECURRENT NEURAL NETWORKS
    Kadambari, Sai Kiran
    Chepuri, Sundeep Prabhakar
    Conference Record of the 2019 Fifty-Third Asilomar Conference on Signals, Systems & Computers, 2019: 467-471
  • [9] Hardness and HOMO-LUMO gap probed by the helium atom pushing the molecular surface of the first-row hydride molecules
    Malolepsza, E
    Piela, L
    Collection of Czechoslovak Chemical Communications, 2003, 68 (12): 2344-2354
  • [10] Accurate and Scalable Graph Convolutional Networks for Recommendation Based on Subgraph Propagation
    Li, Xueqi
    Xiao, Guoqing
    Chen, Yuedan
    Li, Kenli
    Cong, Gao
    IEEE Transactions on Knowledge and Data Engineering, 2024, 36 (12): 7556-7568