Scalable training of graph convolutional neural networks for fast and accurate predictions of HOMO-LUMO gap in molecules

Cited by: 10
Authors
Choi, Jong Youl [1]
Zhang, Pei [2]
Mehta, Kshitij [1]
Blanchard, Andrew [2]
Pasini, Massimiliano Lupo [2]
Affiliations
[1] Oak Ridge Natl Lab, Comp Sci & Math Div, 1 Bethel Valley Rd, Oak Ridge, TN 37831 USA
[2] Oak Ridge Natl Lab, Computat Sci & Engn Div, 1 Bethel Valley Rd, Oak Ridge, TN 37831 USA
Keywords
Graph neural networks; Distributed data parallelism; Surrogate models; Atomic modeling; Molecular dynamics; HOMO-LUMO gap
DOI
10.1186/s13321-022-00652-1
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
Graph Convolutional Neural Network (GCNN) is a popular class of deep learning (DL) models in material science to predict material properties from the graph representation of molecular structures. Training an accurate and comprehensive GCNN surrogate for molecular design requires large-scale graph datasets and is usually a time-consuming process. Recent advances in GPUs and distributed computing open a path to reduce the computational cost for GCNN training effectively. However, efficient utilization of high performance computing (HPC) resources for training requires simultaneously optimizing large-scale data management and scalable stochastic batched optimization techniques. In this work, we focus on building GCNN models on HPC systems to predict material properties of millions of molecules. We use HydraGNN, our in-house library for large-scale GCNN training, leveraging distributed data parallelism in PyTorch. We use ADIOS, a high-performance data management framework for efficient storage and reading of large molecular graph data. We perform parallel training on two open-source large-scale graph datasets to build a GCNN predictor for an important quantum property known as the HOMO-LUMO gap. We measure the scalability, accuracy, and convergence of our approach on two DOE supercomputers: the Summit supercomputer at the Oak Ridge Leadership Computing Facility (OLCF) and the Perlmutter system at the National Energy Research Scientific Computing Center (NERSC). We present our experimental results with HydraGNN showing (i) reduction of data loading time up to 4.2 times compared with a conventional method and (ii) linear scaling performance for training up to 1024 GPUs on both Summit and Perlmutter.
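
The distributed data parallelism mentioned in the abstract can be illustrated with a minimal sketch. It is not HydraGNN or PyTorch code: it simulates the idea in plain Python on a one-parameter linear model, and every name in it (`local_gradient`, `shard_dataset`, `all_reduce_mean`, `data_parallel_step`) and the toy dataset are hypothetical, chosen only for illustration. Each simulated rank computes a gradient on its own shard, the gradients are averaged (the role an all-reduce plays in a real system), and every rank applies the same update.

```python
# Illustrative simulation of data-parallel SGD; not the authors' implementation.

def local_gradient(w, shard):
    """Gradient of the mean squared error of y ~ w * x over one rank's shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)


def shard_dataset(data, world_size):
    """Round-robin split; assumes len(data) is divisible by world_size."""
    return [data[rank::world_size] for rank in range(world_size)]


def all_reduce_mean(values):
    """Stand-in for an all-reduce that averages one value per rank."""
    return sum(values) / len(values)


def data_parallel_step(w, data, world_size, lr):
    """One SGD step: per-rank gradients, averaged, then a shared update."""
    shards = shard_dataset(data, world_size)
    grads = [local_gradient(w, shard) for shard in shards]
    return w - lr * all_reduce_mean(grads)


# Hypothetical toy data following y = 3x.
data = [(float(x), 3.0 * x) for x in range(1, 9)]

w_single = data_parallel_step(0.0, data, world_size=1, lr=0.01)
w_multi = data_parallel_step(0.0, data, world_size=4, lr=0.01)
```

With equal-sized shards, the average of the per-rank gradients equals the gradient over the full batch, so `w_single` and `w_multi` agree up to floating-point rounding. In this scheme, adding ranks changes how the work is divided rather than the update that is applied.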
Pages: 10