Partitioning Sparse Deep Neural Networks for Scalable Training and Inference

Cited by: 6
Authors
Demirci, Gunduz Vehbi [1 ]
Ferhatosmanoglu, Hakan [1 ]
Affiliations
[1] Univ Warwick, Coventry, W Midlands, England
Keywords
Scalable Deep Learning; Sparse Deep Neural Networks; Distributed Stochastic Gradient Descent; Hypergraph Partitioning; Sparse Matrix Vector Multiplication;
DOI
10.1145/3447818.3460372
Chinese Library Classification (CLC)
TP301 [Theory, Methods];
Subject Classification Code
081202;
Abstract
State-of-the-art deep neural networks (DNNs) have significant computational and data management requirements, and the sizes of both training data and models continue to increase. Sparsification and pruning methods have been shown to be effective in removing a large fraction of connections in DNNs. The resulting sparse networks present unique challenges for further improving the computational efficiency of training and inference in deep learning. Both the feed-forward (inference) and backpropagation steps of the stochastic gradient descent (SGD) algorithm for training sparse DNNs involve consecutive sparse matrix-vector multiplications (SpMVs). We first introduce a distributed-memory parallel SpMV-based solution for the SGD algorithm to improve its scalability. The parallelization approach is based on row-wise partitioning of the weight matrices that represent neuron connections between consecutive layers. We then propose a novel hypergraph model for partitioning the weight matrices to reduce the total communication volume and ensure computational load balance among processors. Experiments performed on sparse DNNs demonstrate that the proposed solution is highly efficient and scalable, and that the proposed matrix partitioning scheme further improves its performance significantly.
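To make the computational pattern described in the abstract concrete, the following minimal Python sketch (assuming NumPy and SciPy; it is not the authors' code) expresses sparse-layer inference as consecutive SpMVs, counts the receive volume of a row-wise weight-matrix partition under an assumed owner-computes rule, and evaluates the connectivity-1 metric of the classical column-net hypergraph model for row-wise SpMV partitioning. The function names, the ReLU activation, and the toy two-processor partition are illustrative assumptions, and the column-net construction shown is the standard model rather than the paper's own novel hypergraph model.

# Minimal sketch (not the authors' implementation) of the ideas in the abstract.
import numpy as np
import scipy.sparse as sp


def feed_forward(weights, x):
    """Sparse DNN inference as consecutive SpMVs: one SpMV plus ReLU per layer."""
    for W in weights:                 # W: csr_matrix of shape (n_out, n_in)
        x = np.maximum(W @ x, 0.0)    # SpMV followed by ReLU (assumed activation)
    return x


def rowwise_comm_volume(W, row_owner, vec_owner):
    """Receive volume of one SpMV when rows of W and entries of x are
    distributed to processors (owner-computes rule, an assumption here)."""
    W = W.tocsr()
    needed = {}                                        # processor -> x-indices it must fetch
    for i in range(W.shape[0]):
        p = row_owner[i]
        for j in W.indices[W.indptr[i]:W.indptr[i + 1]]:
            if vec_owner[j] != p:                      # x[j] lives on another processor
                needed.setdefault(p, set()).add(j)
    return sum(len(s) for s in needed.values())


def column_net_volume(W, row_owner):
    """Classical column-net hypergraph view of row-wise SpMV partitioning:
    net j connects the rows with a nonzero in column j, and the
    connectivity-1 metric sum_j (lambda_j - 1) models total communication."""
    W = W.tocsc()
    total = 0
    for j in range(W.shape[1]):
        rows = W.indices[W.indptr[j]:W.indptr[j + 1]]
        parts = {row_owner[i] for i in rows}
        total += max(len(parts) - 1, 0)
    return total


if __name__ == "__main__":
    W1 = sp.random(8, 8, density=0.3, format="csr", random_state=0)
    W2 = sp.random(4, 8, density=0.3, format="csr", random_state=1)
    x = np.random.default_rng(0).random(8)
    print("output:", feed_forward([W1, W2], x))

    # Two processors; rows and vector entries split into halves (toy partition).
    row_owner = np.array([0, 0, 0, 0, 1, 1, 1, 1])
    vec_owner = np.array([0, 0, 0, 0, 1, 1, 1, 1])
    print("receive volume:", rowwise_comm_volume(W1, row_owner, vec_owner))
    print("connectivity-1 volume:", column_net_volume(W1, row_owner))

A hypergraph partitioner such as PaToH or hMETIS would choose row_owner to minimize the connectivity-1 metric subject to a load-balance constraint; the toy partition above is fixed only to make the two volume measures comparable.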
Pages: 254-265
Number of pages: 12
Related papers (50 in total)
  • [1] Performance of Training Sparse Deep Neural Networks on GPUs
    Wang, Jianzong
    Huang, Zhangcheng
    Kong, Lingwei
    Xiao, Jing
    Wang, Pengyu
    Zhang, Lu
    Li, Chao
    [J]. 2019 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC), 2019,
  • [2] Efficient Priors for Scalable Variational Inference in Bayesian Deep Neural Networks
    Krishnan, Ranganath
    Subedar, Mahesh
    Tickoo, Omesh
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 773 - 777
  • [3] Training Sparse Neural Networks
    Srinivas, Suraj
    Subramanya, Akshayvarun
    Babu, R. Venkatesh
    [J]. 2017 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), 2017, : 455 - 462
  • [4] A Scalable GPU-enabled Framework for Training Deep Neural Networks
    Del Monte, Bonaventura
    Prodan, Radu
    [J]. 2016 2ND INTERNATIONAL CONFERENCE ON GREEN HIGH PERFORMANCE COMPUTING (ICGHPC), 2016,
  • [5] Scalable bio-inspired training of Deep Neural Networks with FastHebb
    Lagani, Gabriele
    Falchi, Fabrizio
    Gennaro, Claudio
    Fassold, Hannes
    Amato, Giuseppe
    [J]. NEUROCOMPUTING, 2024, 595
  • [6] Sparse Bayesian Neural Networks: Bridging Model and Parameter Uncertainty through Scalable Variational Inference
    Hubin, Aliaksandr
    Storvik, Geir
    [J]. MATHEMATICS, 2024, 12 (06)
  • [7] EmbRace: Accelerating Sparse Communication for Distributed Training of Deep Neural Networks
    Li, Shengwei
    Lai, Zhiquan
    Li, Dongsheng
    Zhang, Yiming
    Ye, Xiangyu
    Duan, Yabo
    [J]. 51ST INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2022, 2022,
  • [8] Accelerating Training of Deep Neural Networks via Sparse Edge Processing
    Dey, Sourya
    Shao, Yinan
    Chugg, Keith M.
    Beerel, Peter A.
    [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2017, PT I, 2017, 10613 : 273 - 280
  • [9] Computational memory-based inference and training of deep neural networks
    Sebastian, A.
    Boybat, I.
    Dazzi, M.
    Giannopoulos, I.
    Jonnalagadda, V.
    Joshi, V.
    Karunaratne, G.
    Kersting, B.
    Khaddam-Aljameh, R.
    Nandakumar, S. R.
    Petropoulos, A.
    Piveteau, C.
    Antonakopoulos, T.
    Rajendran, B.
    Le Gallo, M.
    Eleftheriou, E.
    [J]. 2019 SYMPOSIUM ON VLSI CIRCUITS, 2019, : T168 - T169
  • [10] BestOf: an online implementation selector for the training and inference of deep neural networks
    Barrachina, Sergio
    Castello, Adrian
    Dolz, Manuel F.
    Tomas, Andres E.
    [J]. JOURNAL OF SUPERCOMPUTING, 2022, 78 (16): 17543 - 17558