BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and Preprocessing

Cited: 0
Authors
Liu, Tianfeng [1 ,3 ,4 ]
Chen, Yangrui [2 ,3 ]
Li, Dan [1 ,4 ]
Wu, Chuan [2 ]
Zhu, Yibo [3 ]
He, Jun [3 ]
Peng, Yanghua [3 ]
Chen, Hongzheng [3 ,5 ]
Chen, Hongzhi [3 ]
Guo, Chuanxiong [3 ]
Institutions
[1] Tsinghua Univ, Beijing, Peoples R China
[2] Univ Hong Kong, Hong Kong, Peoples R China
[3] ByteDance, Beijing, Peoples R China
[4] Zhongguancun Lab, Beijing, Peoples R China
[5] Cornell Univ, Ithaca, NY USA
Funding
National Natural Science Foundation of China;
Keywords
SYSTEM;
DOI
None
Chinese Library Classification (CLC)
TP301 [Theory, Methods];
Discipline code
081202;
Abstract
Graph neural networks (GNNs) have extended the success of deep neural networks (DNNs) to non-Euclidean graph data, achieving ground-breaking performance on various tasks such as node classification and graph property prediction. Nonetheless, existing systems are inefficient at training large graphs with billions of nodes and edges on GPUs. The main bottleneck is the process of preparing data for GPUs: subgraph sampling and feature retrieval. This paper proposes BGL, a distributed GNN training system designed to address these bottlenecks with a few key ideas. First, we propose a dynamic cache engine to minimize feature-retrieval traffic. By co-designing the caching policy and the order of sampling, we find a sweet spot of low overhead and a high cache hit ratio. Second, we improve the graph partitioning algorithm to reduce cross-partition communication during subgraph sampling. Finally, careful resource isolation reduces contention between different data preprocessing stages. Extensive experiments on various GNN models and large graph datasets show that BGL significantly outperforms existing GNN training systems by 1.9x on average.
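The abstract's first idea (co-designing the caching policy with the sampling order) can be illustrated with a minimal sketch. This is not BGL's actual cache engine; the class name, the FIFO policy, and the `fetch_remote` callback are illustrative assumptions. The intuition it demonstrates: when consecutive mini-batches sample overlapping neighborhoods (e.g., seeds visited in a BFS-like order), even a cheap eviction policy yields a high hit ratio on repeated node-feature lookups, avoiding expensive remote fetches.

```python
from collections import OrderedDict


class FeatureCache:
    """Minimal FIFO feature cache (illustrative sketch, not BGL's engine).

    Stores node feature vectors fetched from remote storage. If the
    sampling order keeps nearby nodes in consecutive mini-batches,
    repeated lookups hit the cache instead of the network.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()  # node_id -> feature vector
        self.hits = 0
        self.misses = 0

    def get(self, node_id, fetch_remote):
        if node_id in self.store:
            self.hits += 1
            return self.store[node_id]
        self.misses += 1
        feat = fetch_remote(node_id)  # expensive network/disk fetch
        if len(self.store) >= self.capacity:
            self.store.popitem(last=False)  # evict oldest entry (FIFO)
        self.store[node_id] = feat
        return feat


# Simulated workload: two mini-batches with overlapping neighborhoods,
# as a locality-aware sampling order would produce.
cache = FeatureCache(capacity=4)
fetch = lambda n: [float(n)] * 4  # stand-in for a remote feature fetch
for batch in ([1, 2, 3], [2, 3, 4]):  # nodes 2 and 3 are reused
    for node in batch:
        cache.get(node, fetch)
hit_ratio = cache.hits / (cache.hits + cache.misses)
```

With the overlapping batches above, 2 of 6 lookups hit the cache; a sampling order with no overlap would miss on every lookup at the same capacity.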
Pages: 103-118 (16 pages)