Efficient Data Loader for Fast Sampling-Based GNN Training on Large Graphs

Cited by: 15
Authors
Bai, Youhui [1 ]
Li, Cheng [1 ]
Lin, Zhiqi [1 ]
Wu, Yufei [1 ]
Miao, Youshan [2 ]
Liu, Yunxin [2 ]
Xu, Yinlong [1 ,3 ]
Affiliations
[1] University of Science and Technology of China, School of Computer Science and Technology, Hefei 230026, Anhui, China
[2] Microsoft Research, Beijing 100080, China
[3] Anhui Province Key Laboratory of High Performance Computing, Hefei 230026, China
Funding
National Key Research and Development Program of China
Keywords
Training; graphics processing units; loading; computational modeling; load modeling; partitioning algorithms; deep learning; graph neural network; cache; large graph; graph partition; pipeline; multi-GPU
DOI
10.1109/TPDS.2021.3065737
Chinese Library Classification (CLC)
TP301 [Theory and Methods]
Discipline Code
081202
Abstract
Emerging graph neural networks (GNNs) have extended the success of deep learning from data such as images and text to more complex graph-structured data. By leveraging GPU accelerators, existing frameworks combine mini-batching and sampling for effective and efficient model training on large graphs. However, this setup faces a scalability issue: loading rich vertex features from CPU to GPU over a limited-bandwidth link usually dominates the training cycle. In this article, we propose PaGraph, a novel, efficient data loader that supports general sampling-based GNN training on a single server with multiple GPUs. PaGraph significantly reduces data loading time by exploiting spare GPU resources to cache frequently accessed graph data. It embodies a lightweight yet effective caching policy that jointly accounts for graph structural information and the data access patterns of sampling-based GNN training. Furthermore, to scale out on multiple GPUs, PaGraph develops a fast GNN-computation-aware partition algorithm that avoids cross-partition accesses during data-parallel training and achieves better cache efficiency. Finally, it overlaps data loading with GNN computation to further hide loading costs. Evaluations on two representative GNN models, GCN and GraphSAGE, using two sampling methods, Neighbor and Layer-wise, show that PaGraph can eliminate data loading time from the GNN training pipeline and achieve up to a 4.8x speedup over state-of-the-art baselines. Together with preprocessing optimizations, PaGraph further delivers up to a 16.0x end-to-end speedup.
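
The caching mechanism described in the abstract keeps the features of frequently accessed vertices resident in spare GPU memory, and the caching policy uses graph structure (the paper observes that vertices appearing often in sampled neighborhoods are worth pinning) to decide what to cache. The following is a minimal PyTorch sketch of that idea, not PaGraph's actual implementation; the class name, the num_cached parameter, out-degree as the hotness score, and gather_features are illustrative assumptions.

import torch

class DegreeOrderedFeatureCache:
    """Sketch: statically cache features of high-out-degree vertices on GPU."""

    def __init__(self, features, out_degrees, num_cached, device="cuda"):
        # Pin the full feature matrix in host memory so cache misses can
        # use fast asynchronous DMA transfers.
        self.cpu_feats = features.pin_memory()
        # Cache the num_cached vertices with the highest out-degree.
        hot = torch.argsort(out_degrees, descending=True)[:num_cached]
        self.gpu_feats = features[hot].to(device)
        # Map global vertex id -> slot in the GPU cache (-1 = not cached).
        self.slot = torch.full((features.shape[0],), -1,
                               dtype=torch.long, device=device)
        self.slot[hot.to(device)] = torch.arange(num_cached, device=device)
        self.device = device

    def gather_features(self, nids):
        # Serve cached vertices from GPU memory; only misses cross the
        # CPU-GPU link, which is the bottleneck the paper targets.
        nids = nids.to(self.device)
        slots = self.slot[nids]
        hit = slots >= 0
        out = torch.empty((nids.numel(), self.cpu_feats.shape[1]),
                          device=self.device, dtype=self.cpu_feats.dtype)
        out[hit] = self.gpu_feats[slots[hit]]
        miss_ids = nids[~hit].cpu()
        out[~hit] = self.cpu_feats[miss_ids].to(self.device, non_blocking=True)
        return out

Because the policy is static (cache contents are fixed after preprocessing), lookups need no locking or eviction bookkeeping, which is what makes this cheap enough to sit on the training fast path.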
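
The abstract's final mechanism, overlapping data loading with GNN computation, can likewise be sketched with a background loader thread that prefetches the next mini-batch's features while the GPU trains on the current one. Again a minimal illustration under assumed names: pipelined_train, the (nids, blocks, labels) batch layout, and the cache object from the sketch above are hypothetical, not PaGraph's API.

import queue
import threading

def pipelined_train(batches, cache, model, optimizer, loss_fn):
    # Bounded queue: the loader stays at most two batches ahead.
    prefetched = queue.Queue(maxsize=2)

    def loader():
        for nids, blocks, labels in batches:
            # Stage 1: gather input features (CPU-GPU traffic happens here).
            prefetched.put((cache.gather_features(nids), blocks, labels))
        prefetched.put(None)  # sentinel: no more batches

    threading.Thread(target=loader, daemon=True).start()
    while (item := prefetched.get()) is not None:
        feats, blocks, labels = item
        # Stage 2: GNN computation runs while the loader prepares the
        # next batch, hiding (part of) the loading cost.
        optimizer.zero_grad()
        loss = loss_fn(model(blocks, feats), labels)
        loss.backward()
        optimizer.step()

A thread plus a bounded queue only approximates the overlap; fully hiding the copy behind computation additionally requires pinned host memory (as in the cache sketch) and, in practice, issuing the transfer on a dedicated CUDA stream.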
Pages: 2541-2556
Page count: 16