TLPGNN: A Lightweight Two-Level Parallelism Paradigm for Graph Neural Network Computation on GPU

Cited by: 6
Authors
Fu, Qiang [1 ]
Ji, Yuede [2 ]
Huang, H. Howie [1 ]
Affiliations
[1] George Washington Univ, Washington, DC 20052 USA
[2] Univ North Texas, Denton, TX USA
Funding
U.S. National Science Foundation;
Keywords
Graph Neural Networks; GPU; Performance;
DOI
10.1145/3502181.3531467
Chinese Library Classification
TP3 [Computing Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Graph Neural Networks (GNNs) are an emerging class of deep learning models on graphs, with many successful applications such as recommendation systems, drug discovery, and social network analysis. GNN computation includes both regular neural network operations and general graph convolution operations, with the latter taking the majority of the total computation time. Although several recent works have been proposed to accelerate GNN computation, they suffer from heavy pre-processing, inefficient atomic operations, and unnecessary kernel launches. In this paper, we design TLPGNN, a lightweight two-level parallelism paradigm for GNN computation. First, we conduct a systematic analysis of the hardware resource usage of GNN workloads to understand their distinctive characteristics. Guided by these observations, we divide the GNN computation into two levels, i.e., vertex parallelism at the first level and feature parallelism at the second. Next, we employ a novel hybrid dynamic workload assignment to address imbalanced workload distribution. Furthermore, we fuse the kernels to reduce the number of kernel launches and cache frequently accessed data in registers to avoid unnecessary memory traffic. Together, these techniques allow TLPGNN to significantly outperform existing GNN computation systems, such as DGL, GNNAdvisor, and FeatGraph, by 5.6x, 7.7x, and 3.3x, respectively, on average.
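The two-level scheme described in the abstract can be illustrated with a minimal CUDA sketch. This is our own simplification for illustration, not the authors' released kernel: each warp is mapped to one destination vertex (vertex parallelism), and the lanes of the warp stride across the feature dimension (feature parallelism), so every output row is written by exactly one warp and no atomic operations are needed; the per-feature partial sums stay in registers until write-back.

```cuda
// Illustrative sketch only (assumed CSR layout and sum aggregation, not the
// paper's actual code): warp-per-vertex, lane-per-feature neighbor aggregation.
#include <cuda_runtime.h>

constexpr int WARP_SIZE = 32;

__global__ void warp_per_vertex_aggregate(
    const int *row_ptr,     // CSR row offsets, length num_vertices + 1
    const int *col_idx,     // CSR column indices (neighbor ids)
    const float *features,  // input features, num_vertices x feat_dim
    float *out,             // output features, num_vertices x feat_dim
    int num_vertices,
    int feat_dim)
{
    // First level: one warp owns one destination vertex.
    int warp_id = (blockIdx.x * blockDim.x + threadIdx.x) / WARP_SIZE;
    int lane    = threadIdx.x % WARP_SIZE;
    if (warp_id >= num_vertices) return;

    int begin = row_ptr[warp_id];
    int end   = row_ptr[warp_id + 1];

    // Second level: lanes stride over the feature dimension, accumulating
    // in registers; the warp writes its own output row, so no atomics.
    for (int f = lane; f < feat_dim; f += WARP_SIZE) {
        float acc = 0.0f;
        for (int e = begin; e < end; ++e) {
            acc += features[col_idx[e] * feat_dim + f];
        }
        out[warp_id * feat_dim + f] = acc;
    }
}

// Example launch: one warp per vertex, e.g.
//   int threads = 256;
//   int blocks  = (num_vertices * WARP_SIZE + threads - 1) / threads;
//   warp_per_vertex_aggregate<<<blocks, threads>>>(row_ptr, col_idx,
//       features, out, num_vertices, feat_dim);
```

Because each warp's output row is exclusive, this pattern avoids the inefficient atomic updates that the abstract identifies as a limitation of prior systems; the hybrid dynamic workload assignment and kernel fusion described in the paper are not shown here.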
Pages: 122-134
Number of pages: 13
Related Papers
50 records in total
  • [1] TLPGNN: A Lightweight Two-level Parallelism Paradigm for Graph Neural Network Computation on Single and Multiple GPUs
    Fu, Qiang
    Ji, Yuede
    Rolinger, Thomas B.
    Huang, H. Howie
    [J]. ACM TRANSACTIONS ON PARALLEL COMPUTING, 2024, 11 (02)
  • [2] Two-Level Graph Neural Network
    Ai, Xing
    Sun, Chengyu
    Zhang, Zhihong
    Hancock, Edwin R.
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (04) : 4593 - 4606
  • [3] Efficient GPU Computation Using Task Graph Parallelism
    Lin, Dian-Lun
    Huang, Tsung-Wei
    [J]. EURO-PAR 2021: PARALLEL PROCESSING, 2021, 12820 : 435 - 450
  • [4] Two-level adversarial attacks for graph neural networks
    Song, Chengxi
    Niu, Lingfeng
    Lei, Minglong
    [J]. INFORMATION SCIENCES, 2024, 654
  • [5] Accelerating Large Sparse Neural Network Inference Using GPU Task Graph Parallelism
    Lin, Dian-Lun
    Huang, Tsung-Wei
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (11) : 3041 - 3052
  • [6] Exploiting two-level parallelism in FEM applications
    Plazek, J
    Banas, K
    Kitowski, J
    Boryczko, K
    [J]. HIGH-PERFORMANCE COMPUTING AND NETWORKING, 1997, 1225 : 272 - 281
  • [7] Parallelism hardware computation for Artificial Neural Network
    Marwa, G. A. M.
    Mohamed, Boubaker
    Najoua, Chalbi
    Hedi, Bedoui Mohamed
    [J]. 2017 IEEE/ACS 14TH INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS (AICCSA), 2017, : 1049 - 1055
  • [8] Two-level high precision CMAC neural network
    Avedyan, ED
    [J]. CONTROL OF OSCILLATIONS AND CHAOS - 1997 1ST INTERNATIONAL CONFERENCE, PROCEEDINGS, VOLS 1-3, 1997, : 522 - 525
  • [9] Two-Level Convolutional Neural Network for Aspect Extraction
    Wu, Jialin
    Cai, Yi
    Huang, Qingbao
    Xu, Jingyun
    Wong, Raymond Chi-Wing
    Chen, Jian
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2020, 2020, 12115 : 93 - 105
  • [10] Two-Level Parallelism to Accelerate Multiple Genome Comparisons
    Torreno, Oscar
    Trelles, Oswaldo
    [J]. EURO-PAR 2016: PARALLEL PROCESSING WORKSHOPS, 2017, 10104 : 445 - 456