Partition Pruning: Parallelization-Aware Pruning for Dense Neural Networks

Cited by: 0
Authors
Shahhosseini, Sina [1 ]
Albaqsami, Ahmad [1 ]
Jasemi, Masoomeh [1 ,2 ]
Bagherzadeh, Nader [1 ]
Affiliations
[1] Univ Calif Irvine, Elect Engn & Comp Sci Dept, Irvine, CA 92697 USA
[2] Sharif Univ Technol, Dept Comp Engn, Tehran, Iran
Keywords
Parallelization; Deep Neural Network; Pruning; Partitioning; Hardware Accelerator;
DOI
10.1109/PDP50117.2020.00053
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
As neural networks are improved to become more accurate, their model sizes grow exponentially. A huge number of parameters must therefore be loaded from and stored in the memory hierarchy and computed in processors during the training or inference phase. This growth poses a major challenge for real-time deployment, since improvements in memory bandwidth cannot keep up with the growth in model complexity. Although some operations in neural network processing, such as convolutional layers, are compute-intensive, dense layers face a memory bandwidth bottleneck. To address this issue, this paper proposes Partition Pruning for dense layers, which reduces the number of required parameters while taking parallelization into account. We evaluated the performance and energy consumption of parallel inference on partitioned models, observing a 7.72x speedup and a 2.73x reduction in energy when computing the pruned fully connected layers of the TinyVGG16 model, compared to running the unpruned model on a single accelerator. Moreover, our method showed only a limited reduction in accuracy when partitioning fully connected layers.
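The partition-then-prune idea described in the abstract can be illustrated with a short sketch. The snippet below is only a minimal, illustrative approximation: it assumes a fully connected weight matrix is split column-wise across accelerators and pruned by weight magnitude within each partition; the paper's actual partitioning scheme and pruning criterion may differ, and the function name partition_prune_fc is hypothetical.

```python
import numpy as np

def partition_prune_fc(weights, num_partitions, sparsity):
    """Illustrative sketch: partition a dense layer's weights, then prune.

    The weight matrix is split column-wise into one block per accelerator,
    and the smallest-magnitude weights are zeroed within each block, so every
    partition keeps the same fraction of parameters.
    """
    pruned_parts = []
    for block in np.array_split(weights, num_partitions, axis=1):
        # Per-partition magnitude threshold: keep roughly the largest
        # (1 - sparsity) fraction of weights inside this block.
        threshold = np.quantile(np.abs(block), sparsity)
        pruned_parts.append(np.where(np.abs(block) >= threshold, block, 0.0))
    return pruned_parts  # one sparse block per accelerator

# Example: a 512x1024 dense layer split across 4 accelerators at 80% sparsity.
W = np.random.randn(512, 1024).astype(np.float32)
blocks = partition_prune_fc(W, num_partitions=4, sparsity=0.8)
print([float((b != 0).mean()) for b in blocks])  # ~0.2 density per partition
```

Pruning each partition to the same sparsity keeps the per-accelerator workload balanced, which is what would make the pruning parallelization-aware in this sketch.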
Pages: 307 - 311
Number of pages: 5