Accelerating Distributed Inference of Sparse Deep Neural Networks via Mitigating the Straggler Effect

Times cited: 0
Authors
Mofrad, Mohammad Hasanzadeh [1 ]
Melhem, Rami [1 ]
Ahmad, Yousuf [2 ]
Hammoud, Mohammad [2 ]
Affiliations
[1] Univ Pittsburgh, Pittsburgh, PA 15260 USA
[2] Carnegie Mellon Univ Qatar, Doha, Qatar
Keywords
Sparse DNN; sparse matrix-matrix multiplication; DNN parallelism; data parallelism; model parallelism
DOI
10.1109/hpec43674.2020.9286189
CLC number
TP3 [Computing technology, computer technology]
Discipline code
0812
Abstract
Once a Deep Neural Network (DNN) is trained, an inference algorithm retains the learning and applies it to batches of data. A trained DNN can be sparse, either because of pruning or because it follows a preset sparse connectivity pattern, and inference in such sparse networks requires less space and time than in dense ones. Like dense DNNs, sparse DNNs can be parallelized using model or data parallelism, whereby the former partitions the network and the latter partitions the input among multiple threads. Model parallelism utilizes the Last Level Cache (LLC) efficiently but incurs a heavy synchronization cost because of compulsory per-layer reductions. In contrast, data parallelism allows independent execution of partitions but suffers from a straggler effect due to load imbalance among partitions. We combine data and model parallelism through a new type of parallelism that we denote data-then-model. In data-then-model, each thread starts with data parallelism, thus mitigating the per-layer synchronization cost of model parallelism. Once a thread finishes its partition, it switches to model parallelism to support a slower active thread, thereby alleviating the straggler effect of data parallelism. We compare data-then-model parallelism against data, model, and task-based parallelisms using the IEEE HPEC sparse DNN challenge dataset, achieving speedups of 10% to 65% on average.
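To make the scheme concrete, the following is a minimal, hypothetical Python sketch of data-then-model parallelism. It is not the authors' implementation: dense NumPy matrix multiplications stand in for the paper's sparse matrix-matrix multiplications (SpMM), the half-and-half column split and the queue-based helper hand-off are simplifications of our own, and all names (worker, idle, inbox) are illustrative.

# Toy data-then-model parallelism (illustrative only). NumPy matmul releases
# the GIL, so Python threads do overlap on the heavy computation.
import queue
import threading
import time

import numpy as np

rng = np.random.default_rng(0)
NUM_LAYERS, DIM = 8, 512

# Stand-in weight matrices, one per layer (dense here; sparse in the paper).
weights = [rng.standard_normal((DIM, DIM), dtype=np.float32) * 0.01
           for _ in range(NUM_LAYERS)]

# Deliberately imbalanced batch partitions so the last worker straggles.
partitions = [rng.standard_normal((rows, DIM), dtype=np.float32)
              for rows in (64, 64, 64, 2048)]

idle = queue.Queue()            # inboxes of workers done with their own data
results = [None] * len(partitions)
done = threading.Event()

def worker(wid, x):
    # Phase 1 -- data parallelism: run all layers on our own input partition.
    for w in weights:
        try:
            helper_inbox = idle.get_nowait()   # an idle worker offers help
        except queue.Empty:
            x = np.maximum(x @ w, 0.0)         # no helper: compute alone
            continue
        # A helper is available -- model parallelism: split this layer's
        # output columns between us and the helper, then concatenate.
        half = w.shape[1] // 2
        reply = queue.Queue(maxsize=1)
        helper_inbox.put((x, w[:, half:], reply))  # helper: right columns
        left = x @ w[:, :half]                     # we: left columns
        x = np.maximum(np.concatenate([left, reply.get()], axis=1), 0.0)
    results[wid] = x

    # Phase 2 -- our partition is finished: volunteer as a model-parallel
    # helper for whichever straggler picks up our inbox.
    inbox = queue.Queue(maxsize=1)
    idle.put(inbox)
    while not done.is_set():
        try:
            x_part, w_slice, reply = inbox.get(timeout=0.01)
        except queue.Empty:
            continue
        reply.put(x_part @ w_slice)            # compute the straggler's slice
        idle.put(inbox)                        # re-enlist for the next layer

threads = [threading.Thread(target=worker, args=(i, p))
           for i, p in enumerate(partitions)]
for t in threads:
    t.start()
while any(r is None for r in results):         # wait for all partitions
    time.sleep(0.01)
done.set()                                     # release the helpers
for t in threads:
    t.join()
print([r.shape for r in results])

In this toy, a finished worker re-enlists after each layer slice it serves, so a straggler checks for helpers at layer granularity, mirroring the per-layer switching the abstract describes; the paper's actual switching and work-partitioning policies may differ.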
Pages: 7
Related papers (10 of 50 shown)
  • [1] Li, Shengwei; Lai, Zhiquan; Li, Dongsheng; Zhang, Yiming; Ye, Xiangyu; Duan, Yabo. EmbRace: Accelerating Sparse Communication for Distributed Training of Deep Neural Networks. 51st International Conference on Parallel Processing (ICPP 2022), 2022.
  • [2] Dey, Sourya; Shao, Yinan; Chugg, Keith M.; Beerel, Peter A. Accelerating Training of Deep Neural Networks via Sparse Edge Processing. Artificial Neural Networks and Machine Learning (ICANN 2017), Part I, vol. 10613, 2017, pp. 273-280.
  • [3] Huang, Sitao; Pearson, Carl; Nagi, Rakesh; Xiong, Jinjun; Chen, Deming; Hwu, Wen-mei. Accelerating Sparse Deep Neural Networks on FPGAs. 2019 IEEE High Performance Extreme Computing Conference (HPEC), 2019.
  • [4] Abdi, Afshin; Rashidi, Saeed; Fekri, Faramarz; Krishna, Tushar. Efficient Distributed Inference of Deep Neural Networks via Restructuring and Pruning. Thirty-Seventh AAAI Conference on Artificial Intelligence, vol. 37, no. 6, 2023, pp. 6640-6648.
  • [5] Jiang, Shui; Huang, Tsung-Wei; Yu, Bei; Ho, Tsung-Yi. SNICIT: Accelerating Sparse Neural Network Inference via Compression at Inference Time on GPU. Proceedings of the 52nd International Conference on Parallel Processing (ICPP 2023), 2023, pp. 51-61.
  • [6] Xu, Jie; Wang, Jingyu; Qi, Qi; Sun, Haifeng; Liao, Jianxin. Accelerating Training for Distributed Deep Neural Networks in MapReduce. Web Services (ICWS 2018), vol. 10966, 2018, pp. 181-195.
  • [7] Sun, Yufei; Zheng, Long; Wang, Qinggang; Ye, Xiangyu; Huang, Yu; Yao, Pengcheng; Liao, Xiaofei; Jin, Hai. Accelerating Sparse Deep Neural Network Inference Using GPU Tensor Cores. 2022 IEEE High Performance Extreme Computing Virtual Conference (HPEC), 2022.
  • [8] Demirci, Gunduz Vehbi; Ferhatosmanoglu, Hakan. Partitioning Sparse Deep Neural Networks for Scalable Training and Inference. Proceedings of the 2021 ACM International Conference on Supercomputing (ICS 2021), 2021, pp. 254-265.
  • [9] Lee, Hochan; Lee, Jaewook; Kim, Heewon; Pack, Sangheon. Straggler-Aware In-Network Aggregation for Accelerating Distributed Deep Learning. IEEE Transactions on Services Computing, 16(6), 2023, pp. 4198-4204.
  • [10] Garg, Shweta; Krishnan, R.; Jagannathan, S.; Samaranayake, V. A. Distributed Learning of Deep Sparse Neural Networks for High-dimensional Classification. 2018 IEEE International Conference on Big Data (Big Data), 2018, pp. 1587-1592.