Partitioning Sparse Deep Neural Networks for Scalable Training and Inference

Cited by: 6
Authors
Demirci, Gunduz Vehbi [1 ]
Ferhatosmanoglu, Hakan [1 ]
Affiliations
[1] Univ Warwick, Coventry, W Midlands, England
Keywords
Scalable Deep Learning; Sparse Deep Neural Networks; Distributed Stochastic Gradient Descent; Hypergraph Partitioning; Sparse Matrix Vector Multiplication;
DOI
10.1145/3447818.3460372
Chinese Library Classification (CLC)
TP301 [Theory, Methods];
Subject Classification Code
081202;
Abstract
State-of-the-art deep neural networks (DNNs) have significant computational and data management requirements, and the sizes of both training data and models continue to increase. Sparsification and pruning methods have been shown to be effective in removing a large fraction of connections in DNNs. The resulting sparse networks present unique challenges for further improving the computational efficiency of training and inference in deep learning. Both the feed-forward (inference) and backpropagation steps of the stochastic gradient descent (SGD) algorithm for training sparse DNNs involve consecutive sparse matrix-vector multiplications (SpMVs). We first introduce a distributed-memory parallel SpMV-based solution for the SGD algorithm to improve its scalability. The parallelization approach is based on row-wise partitioning of the weight matrices that represent neuron connections between consecutive layers. We then propose a novel hypergraph model for partitioning the weight matrices to reduce the total communication volume and ensure computational load balance among processors. Experiments performed on sparse DNNs demonstrate that the proposed solution is highly efficient and scalable, and that the proposed matrix partitioning scheme further improves its performance significantly.
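To make the computational pattern described in the abstract concrete, the following minimal Python sketch (assuming NumPy and SciPy; it is not the authors' code) expresses sparse-layer inference as consecutive SpMVs, counts the receive volume of a row-wise weight-matrix partition under an assumed owner-computes rule, and evaluates the connectivity-1 metric of the classical column-net hypergraph model for row-wise SpMV partitioning. The function names, the ReLU activation, and the toy two-processor partition are illustrative assumptions, and the column-net construction shown is the standard model rather than the paper's own novel hypergraph model.

# Minimal sketch (not the authors' implementation) of the ideas in the abstract.
import numpy as np
import scipy.sparse as sp


def feed_forward(weights, x):
    """Sparse DNN inference as consecutive SpMVs: one SpMV plus ReLU per layer."""
    for W in weights:                 # W: csr_matrix of shape (n_out, n_in)
        x = np.maximum(W @ x, 0.0)    # SpMV followed by ReLU (assumed activation)
    return x


def rowwise_comm_volume(W, row_owner, vec_owner):
    """Receive volume of one SpMV when rows of W and entries of x are
    distributed to processors (owner-computes rule, an assumption here)."""
    W = W.tocsr()
    needed = {}                                        # processor -> x-indices it must fetch
    for i in range(W.shape[0]):
        p = row_owner[i]
        for j in W.indices[W.indptr[i]:W.indptr[i + 1]]:
            if vec_owner[j] != p:                      # x[j] lives on another processor
                needed.setdefault(p, set()).add(j)
    return sum(len(s) for s in needed.values())


def column_net_volume(W, row_owner):
    """Classical column-net hypergraph view of row-wise SpMV partitioning:
    net j connects the rows with a nonzero in column j, and the
    connectivity-1 metric sum_j (lambda_j - 1) models total communication."""
    W = W.tocsc()
    total = 0
    for j in range(W.shape[1]):
        rows = W.indices[W.indptr[j]:W.indptr[j + 1]]
        parts = {row_owner[i] for i in rows}
        total += max(len(parts) - 1, 0)
    return total


if __name__ == "__main__":
    W1 = sp.random(8, 8, density=0.3, format="csr", random_state=0)
    W2 = sp.random(4, 8, density=0.3, format="csr", random_state=1)
    x = np.random.default_rng(0).random(8)
    print("output:", feed_forward([W1, W2], x))

    # Two processors; rows and vector entries split into halves (toy partition).
    row_owner = np.array([0, 0, 0, 0, 1, 1, 1, 1])
    vec_owner = np.array([0, 0, 0, 0, 1, 1, 1, 1])
    print("receive volume:", rowwise_comm_volume(W1, row_owner, vec_owner))
    print("connectivity-1 volume:", column_net_volume(W1, row_owner))

A hypergraph partitioner such as PaToH or hMETIS would choose row_owner to minimize the connectivity-1 metric subject to a load-balance constraint; the toy partition above is fixed only to make the two volume measures comparable.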
Pages: 254-265
Number of pages: 12
Related papers (50 in total)
  • [1] Performance of Training Sparse Deep Neural Networks on GPUs
    Wang, Jianzong
    Huang, Zhangcheng
    Kong, Lingwei
    Xiao, Jing
    Wang, Pengyu
    Zhang, Lu
    Li, Chao
    [J]. 2019 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC), 2019,
  • [2] Efficient Priors for Scalable Variational Inference in Bayesian Deep Neural Networks
    Krishnan, Ranganath
    Subedar, Mahesh
    Tickoo, Omesh
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 773 - 777
  • [3] Training Sparse Neural Networks
    Srinivas, Suraj
    Subramanya, Akshayvarun
    Babu, R. Venkatesh
    [J]. 2017 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), 2017, : 455 - 462
  • [4] A Scalable GPU-enabled Framework for Training Deep Neural Networks
    Del Monte, Bonaventura
    Prodan, Radu
    [J]. 2016 2ND INTERNATIONAL CONFERENCE ON GREEN HIGH PERFORMANCE COMPUTING (ICGHPC), 2016,
  • [5] Scalable bio-inspired training of Deep Neural Networks with FastHebb
    Lagani, Gabriele
    Falchi, Fabrizio
    Gennaro, Claudio
    Fassold, Hannes
    Amato, Giuseppe
    [J]. NEUROCOMPUTING, 2024, 595
  • [6] Sparse Bayesian Neural Networks: Bridging Model and Parameter Uncertainty through Scalable Variational Inference
    Hubin, Aliaksandr
    Storvik, Geir
    [J]. MATHEMATICS, 2024, 12 (06)
  • [7] EmbRace: Accelerating Sparse Communication for Distributed Training of Deep Neural Networks
    Li, Shengwei
    Lai, Zhiquan
    Li, Dongsheng
    Zhang, Yiming
    Ye, Xiangyu
    Duan, Yabo
    [J]. 51ST INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2022, 2022,
  • [8] Accelerating Training of Deep Neural Networks via Sparse Edge Processing
    Dey, Sourya
    Shao, Yinan
    Chugg, Keith M.
    Beerel, Peter A.
    [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2017, PT I, 2017, 10613 : 273 - 280
  • [9] Computational memory-based inference and training of deep neural networks
    Sebastian, A.
    Boybat, I.
    Dazzi, M.
    Giannopoulos, I.
    Jonnalagadda, V.
    Joshi, V.
    Karunaratne, G.
    Kersting, B.
    Khaddam-Aljameh, R.
    Nandakumar, S. R.
    Petropoulos, A.
    Piveteau, C.
    Antonakopoulos, T.
    Rajendran, B.
    Le Gallo, M.
    Eleftheriou, E.
    [J]. 2019 SYMPOSIUM ON VLSI CIRCUITS, 2019, : T168 - T169
  • [10] BestOf: an online implementation selector for the training and inference of deep neural networks
    Barrachina, Sergio
    Castello, Adrian
    Dolz, Manuel F.
    Tomas, Andres E.
    [J]. JOURNAL OF SUPERCOMPUTING, 2022, 78 (16): 17543 - 17558