Scalable training of graph convolutional neural networks for fast and accurate predictions of HOMO-LUMO gap in molecules

Cited by: 10
Authors
Choi, Jong Youl [1 ]
Zhang, Pei [2 ]
Mehta, Kshitij [1 ]
Blanchard, Andrew [2 ]
Pasini, Massimiliano Lupo [2 ]
Institutions
[1] Oak Ridge Natl Lab, Comp Sci & Math Div, 1 Bethel Valley Rd, Oak Ridge, TN 37831 USA
[2] Oak Ridge Natl Lab, Computat Sci & Engn Div, 1 Bethel Valley Rd, Oak Ridge, TN 37831 USA
Keywords
Graph neural networks; Distributed data parallelism; Surrogate models; Atomic modeling; Molecular dynamics; HOMO-LUMO gap
DOI
10.1186/s13321-022-00652-1
CLC Number
O6 [Chemistry]
Discipline Code
0703
Abstract
Graph Convolutional Neural Networks (GCNNs) are a popular class of deep learning (DL) models in materials science for predicting material properties from graph representations of molecular structures. Training an accurate and comprehensive GCNN surrogate for molecular design requires large-scale graph datasets and is usually a time-consuming process. Recent advances in GPUs and distributed computing open a path to effectively reduce the computational cost of GCNN training. However, efficient utilization of high-performance computing (HPC) resources for training requires simultaneously optimizing large-scale data management and scalable stochastic batched optimization techniques. In this work, we focus on building GCNN models on HPC systems to predict the material properties of millions of molecules. We use HydraGNN, our in-house library for large-scale GCNN training that leverages distributed data parallelism in PyTorch, together with ADIOS, a high-performance data management framework, for efficient storage and reading of large molecular graph data. We perform parallel training on two open-source large-scale graph datasets to build a GCNN predictor for an important quantum property known as the HOMO-LUMO gap. We measure the scalability, accuracy, and convergence of our approach on two DOE supercomputers: the Summit supercomputer at the Oak Ridge Leadership Computing Facility (OLCF) and the Perlmutter system at the National Energy Research Scientific Computing Center (NERSC). Our experimental results with HydraGNN show (i) up to a 4.2x reduction in data loading time compared with a conventional method and (ii) linear scaling of training performance up to 1024 GPUs on both Summit and Perlmutter.
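For context on the target property above: the HOMO-LUMO gap is simply the energy difference between the lowest unoccupied (LUMO) and highest occupied (HOMO) molecular orbitals. A minimal sketch of this definition, using hypothetical orbital energies not taken from the paper's datasets:

```python
def homo_lumo_gap(orbital_energies, n_electrons):
    """Return the HOMO-LUMO gap for a closed-shell molecule.

    Each spatial orbital holds two electrons, so the HOMO is
    orbital index n_electrons // 2 - 1 in energy-sorted order,
    and the LUMO is the orbital immediately above it.
    """
    energies = sorted(orbital_energies)
    homo_index = n_electrons // 2 - 1
    e_homo = energies[homo_index]
    e_lumo = energies[homo_index + 1]
    return e_lumo - e_homo

# Hypothetical orbital energies (eV) for a 4-electron system:
energies = [-15.2, -9.7, -1.3, 2.4]
gap = homo_lumo_gap(energies, n_electrons=4)
print(f"HOMO-LUMO gap: {gap:.1f} eV")  # HOMO = -9.7 eV, LUMO = -1.3 eV
```

The gap is the quantity the GCNN surrogate learns to regress directly from the molecular graph, bypassing the quantum-chemistry calculation that would otherwise produce the orbital energies.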
Pages: 10
Related Papers
50 records in total
  • [1] Scalable training of graph convolutional neural networks for fast and accurate predictions of HOMO-LUMO gap in molecules
    Jong Youl Choi
    Pei Zhang
    Kshitij Mehta
    Andrew Blanchard
    Massimiliano Lupo Pasini
    Journal of Cheminformatics, 14
  • [2] Electrostatic considerations affecting the calculated HOMO-LUMO gap in protein molecules
    Lever, Greg
    Cole, Daniel J.
    Hine, Nicholas D. M.
    Haynes, Peter D.
    Payne, Mike C.
    Journal of Physics: Condensed Matter, 2013, 25 (15)
  • [3] Accurate, efficient and scalable training of Graph Neural Networks
    Zeng, Hanqing
    Zhou, Hongkuan
    Srivastava, Ajitesh
    Kannan, Rajgopal
    Prasanna, Viktor
    Journal of Parallel and Distributed Computing, 2021, 147: 166-183
  • [4] Fast and Accurate Predictions of Total Energy for Solid Solution Alloys with Graph Convolutional Neural Networks
    Pasini, Massimiliano Lupo
    Burcul, Marko
    Reeve, Samuel Temple
    Eisenbach, Markus
    Perotto, Simona
    Driving Scientific and Engineering Discoveries Through the Integration of Experiment, Big Data, and Modeling and Simulation, 2022, 1512: 79-98
  • [5] Tuning the HOMO-LUMO gap of polycyclic conjugated molecules using benzo-annelation strategy
    Radenkovic, Slavko
    Dordevic, Sladana
    Nikolendzic, Marijana
    Chemical Physics Letters, 2024, 856
  • [6] FairSample: Training Fair and Accurate Graph Convolutional Neural Networks Efficiently
    Cong, Zicun
    Shi, Baoxu
    Li, Shan
    Yang, Jaewon
    He, Qi
    Pei, Jian
    IEEE Transactions on Knowledge and Data Engineering, 2024, 36 (04): 1537-1551
  • [7] FAST-GO: Fast, Accurate, and Scalable Hardware Trojan Detection using Graph Convolutional Networks
    Imangholi, Ali
    Hashemi, Mona
    Momeni, Amirabbas
    Mohammadi, Siamak
    Carlson, Trevor E.
    2024 25th International Symposium on Quality Electronic Design (ISQED 2024), 2024
  • [8] FAST GRAPH CONVOLUTIONAL RECURRENT NEURAL NETWORKS
    Kadambari, Sai Kiran
    Chepuri, Sundeep Prabhakar
    Conference Record of the 2019 Fifty-Third Asilomar Conference on Signals, Systems & Computers, 2019: 467-471
  • [9] Hardness and HOMO-LUMO gap probed by the helium atom pushing the molecular surface of the first-row hydride molecules
    Malolepsza, E
    Piela, L
    Collection of Czechoslovak Chemical Communications, 2003, 68 (12): 2344-2354
  • [10] Accurate and Scalable Graph Convolutional Networks for Recommendation Based on Subgraph Propagation
    Li, Xueqi
    Xiao, Guoqing
    Chen, Yuedan
    Li, Kenli
    Cong, Gao
    IEEE Transactions on Knowledge and Data Engineering, 2024, 36 (12): 7556-7568