Accelerating Distributed Inference of Sparse Deep Neural Networks via Mitigating the Straggler Effect

Times cited: 0
Authors
Mofrad, Mohammad Hasanzadeh [1 ]
Melhem, Rami [1 ]
Ahmad, Yousuf [2 ]
Hammoud, Mohammad [2 ]
Affiliations
[1] Univ Pittsburgh, Pittsburgh, PA 15260 USA
[2] Carnegie Mellon Univ Qatar, Doha, Qatar
Keywords
Sparse DNN; sparse matrix-matrix multiplication; DNN parallelism; data parallelism; model parallelism
DOI
10.1109/hpec43674.2020.9286189
CLC number
TP3 [Computing technology, computer technology]
Discipline code
0812
Abstract
Once a Deep Neural Network (DNN) is trained, an inference algorithm retains the learning and applies it to batches of data. A trained DNN can be sparse, either because of pruning or because it follows a preset sparse connectivity pattern, and inference in such sparse networks requires less space and time than in dense ones. Like dense DNNs, sparse DNNs can be parallelized using model or data parallelism, whereby the former partitions the network and the latter partitions the input among multiple threads. Model parallelism utilizes the Last Level Cache (LLC) efficiently but incurs a heavy synchronization cost because of compulsory per-layer reductions. In contrast, data parallelism allows independent execution of partitions but suffers from a straggler effect due to load imbalance among partitions. We combine data and model parallelism through a new type of parallelism that we denote data-then-model. In data-then-model, each thread starts with data parallelism, thus mitigating the per-layer synchronization cost of model parallelism. Once a thread finishes its partition, it switches to model parallelism to support a slower active thread, thereby alleviating the straggler effect of data parallelism. We compare data-then-model parallelism against data, model, and task-based parallelisms using the IEEE HPEC sparse DNN challenge dataset, achieving speedups of 10% to 65% on average.
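To make the scheme concrete, the following is a minimal, hypothetical Python sketch of data-then-model parallelism. It is not the authors' implementation: dense NumPy matrix multiplications stand in for the paper's sparse matrix-matrix multiplications (SpMM), the half-and-half column split and the queue-based helper hand-off are simplifications of our own, and all names (worker, idle, inbox) are illustrative.

# Toy data-then-model parallelism (illustrative only). NumPy matmul releases
# the GIL, so Python threads do overlap on the heavy computation.
import queue
import threading
import time

import numpy as np

rng = np.random.default_rng(0)
NUM_LAYERS, DIM = 8, 512

# Stand-in weight matrices, one per layer (dense here; sparse in the paper).
weights = [rng.standard_normal((DIM, DIM), dtype=np.float32) * 0.01
           for _ in range(NUM_LAYERS)]

# Deliberately imbalanced batch partitions so the last worker straggles.
partitions = [rng.standard_normal((rows, DIM), dtype=np.float32)
              for rows in (64, 64, 64, 2048)]

idle = queue.Queue()            # inboxes of workers done with their own data
results = [None] * len(partitions)
done = threading.Event()

def worker(wid, x):
    # Phase 1 -- data parallelism: run all layers on our own input partition.
    for w in weights:
        try:
            helper_inbox = idle.get_nowait()   # an idle worker offers help
        except queue.Empty:
            x = np.maximum(x @ w, 0.0)         # no helper: compute alone
            continue
        # A helper is available -- model parallelism: split this layer's
        # output columns between us and the helper, then concatenate.
        half = w.shape[1] // 2
        reply = queue.Queue(maxsize=1)
        helper_inbox.put((x, w[:, half:], reply))  # helper: right columns
        left = x @ w[:, :half]                     # we: left columns
        x = np.maximum(np.concatenate([left, reply.get()], axis=1), 0.0)
    results[wid] = x

    # Phase 2 -- our partition is finished: volunteer as a model-parallel
    # helper for whichever straggler picks up our inbox.
    inbox = queue.Queue(maxsize=1)
    idle.put(inbox)
    while not done.is_set():
        try:
            x_part, w_slice, reply = inbox.get(timeout=0.01)
        except queue.Empty:
            continue
        reply.put(x_part @ w_slice)            # compute the straggler's slice
        idle.put(inbox)                        # re-enlist for the next layer

threads = [threading.Thread(target=worker, args=(i, p))
           for i, p in enumerate(partitions)]
for t in threads:
    t.start()
while any(r is None for r in results):         # wait for all partitions
    time.sleep(0.01)
done.set()                                     # release the helpers
for t in threads:
    t.join()
print([r.shape for r in results])

In this toy, a finished worker re-enlists after each layer slice it serves, so a straggler checks for helpers at layer granularity, mirroring the per-layer switching the abstract describes; the paper's actual switching and work-partitioning policies may differ.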
Pages: 7
Related papers (10 of 50 shown)
  • [1] Li, Shengwei; Lai, Zhiquan; Li, Dongsheng; Zhang, Yiming; Ye, Xiangyu; Duan, Yabo. EmbRace: Accelerating Sparse Communication for Distributed Training of Deep Neural Networks. 51st International Conference on Parallel Processing (ICPP 2022), 2022.
  • [2] Dey, Sourya; Shao, Yinan; Chugg, Keith M.; Beerel, Peter A. Accelerating Training of Deep Neural Networks via Sparse Edge Processing. Artificial Neural Networks and Machine Learning (ICANN 2017), Part I, vol. 10613, 2017, pp. 273-280.
  • [3] Huang, Sitao; Pearson, Carl; Nagi, Rakesh; Xiong, Jinjun; Chen, Deming; Hwu, Wen-mei. Accelerating Sparse Deep Neural Networks on FPGAs. 2019 IEEE High Performance Extreme Computing Conference (HPEC), 2019.
  • [4] Abdi, Afshin; Rashidi, Saeed; Fekri, Faramarz; Krishna, Tushar. Efficient Distributed Inference of Deep Neural Networks via Restructuring and Pruning. Thirty-Seventh AAAI Conference on Artificial Intelligence, vol. 37, no. 6, 2023, pp. 6640-6648.
  • [5] Jiang, Shui; Huang, Tsung-Wei; Yu, Bei; Ho, Tsung-Yi. SNICIT: Accelerating Sparse Neural Network Inference via Compression at Inference Time on GPU. Proceedings of the 52nd International Conference on Parallel Processing (ICPP 2023), 2023, pp. 51-61.
  • [6] Xu, Jie; Wang, Jingyu; Qi, Qi; Sun, Haifeng; Liao, Jianxin. Accelerating Training for Distributed Deep Neural Networks in MapReduce. Web Services (ICWS 2018), vol. 10966, 2018, pp. 181-195.
  • [7] Sun, Yufei; Zheng, Long; Wang, Qinggang; Ye, Xiangyu; Huang, Yu; Yao, Pengcheng; Liao, Xiaofei; Jin, Hai. Accelerating Sparse Deep Neural Network Inference Using GPU Tensor Cores. 2022 IEEE High Performance Extreme Computing Virtual Conference (HPEC), 2022.
  • [8] Demirci, Gunduz Vehbi; Ferhatosmanoglu, Hakan. Partitioning Sparse Deep Neural Networks for Scalable Training and Inference. Proceedings of the 2021 ACM International Conference on Supercomputing (ICS 2021), 2021, pp. 254-265.
  • [9] Lee, Hochan; Lee, Jaewook; Kim, Heewon; Pack, Sangheon. Straggler-Aware In-Network Aggregation for Accelerating Distributed Deep Learning. IEEE Transactions on Services Computing, 16(6), 2023, pp. 4198-4204.
  • [10] Garg, Shweta; Krishnan, R.; Jagannathan, S.; Samaranayake, V. A. Distributed Learning of Deep Sparse Neural Networks for High-dimensional Classification. 2018 IEEE International Conference on Big Data (Big Data), 2018, pp. 1587-1592.