Partition Pruning: Parallelization-Aware Pruning for Dense Neural Networks

Times Cited: 0
Authors
Shahhosseini, Sina [1]
Albaqsami, Ahmad [1]
Jasemi, Masoomeh [1,2]
Bagherzadeh, Nader [1]
Affiliations
[1] Univ Calif Irvine, Elect Engn & Comp Sci Dept, Irvine, CA 92697 USA
[2] Sharif Univ Technol, Dept Comp Engn, Tehran, Iran
Keywords
Parallelization; Deep Neural Network; Pruning; Partitioning; Hardware Accelerator;
DOI
10.1109/PDP50117.2020.00053
CLC Number
TP3 [Computing Technology, Computer Technology]
Subject Classification Code
0812
Abstract
As recent neural networks are improved to be more accurate, their model sizes grow exponentially. Consequently, a huge number of parameters must be loaded from and stored to the memory hierarchy and computed in processors to perform the training or inference phase of neural network processing. This growth in parameter count poses a major challenge for real-time deployment, since the rate of memory bandwidth improvement cannot keep up with the growth in model complexity. Although some operations in neural network processing, such as convolutional layer computation, are compute-intensive, computing dense layers faces a memory bandwidth bottleneck. To address this issue, this paper proposes Partition Pruning for dense layers, which reduces the number of required parameters while taking parallelization into consideration. We evaluated the performance and energy consumption of parallel inference of the partitioned models, which showed a 7.72x speedup and a 2.73x reduction in the energy used for computing the pruned fully connected layers of the TinyVGG16 model, compared to running the unpruned model on a single accelerator. In addition, our method showed only a limited reduction in accuracy when partitioning fully connected layers.
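
The abstract describes partitioning dense (fully connected) layers so that each partition can run on a separate accelerator, with pruning applied within each partition. The following is a minimal NumPy sketch of that idea under assumptions not stated in the abstract: the weight matrix is split evenly by output neurons, and magnitude-based pruning is applied per partition so every accelerator keeps the same fraction of its weights. The function name partition_prune_dense, the even row split, and the magnitude criterion are illustrative choices, not the paper's exact method.

# Sketch only: even row-wise partitioning plus per-partition magnitude
# pruning are assumptions for illustration, not the paper's exact scheme.
import numpy as np

def partition_prune_dense(weights, num_partitions, sparsity):
    """Split a dense layer's (out_features, in_features) weight matrix into
    num_partitions row blocks (one per accelerator) and zero out the
    smallest-magnitude weights within each block."""
    partitions = np.array_split(weights, num_partitions, axis=0)
    pruned = []
    for block in partitions:
        # Threshold chosen per partition so each accelerator prunes the same
        # fraction of its weights, keeping the workload balanced.
        k = int(np.ceil(block.size * sparsity))
        if k > 0:
            threshold = np.partition(np.abs(block).ravel(), k - 1)[k - 1]
            block = np.where(np.abs(block) > threshold, block, 0.0)
        pruned.append(block)
    return pruned  # one pruned weight block per accelerator

# Example: a 512x1024 dense layer split across 4 accelerators at 80% sparsity.
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 1024)).astype(np.float32)
blocks = partition_prune_dense(W, num_partitions=4, sparsity=0.8)
for i, b in enumerate(blocks):
    kept = np.count_nonzero(b) / b.size
    print(f"partition {i}: shape={b.shape}, kept={kept:.2%}")

Each block can then be assigned to its own accelerator for parallel inference, which is the setting in which the paper reports its speedup and energy results.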
Pages: 307-311
Page count: 5