Partition Pruning: Parallelization-Aware Pruning for Dense Neural Networks

Authors
Shahhosseini, Sina [1]
Albaqsami, Ahmad [1]
Jasemi, Masoomeh [1, 2]
Bagherzadeh, Nader [1]
Affiliations
[1] Univ Calif Irvine, Elect Engn & Comp Sci Dept, Irvine, CA 92697 USA
[2] Sharif Univ Technol, Dept Comp Engn, Tehran, Iran
Keywords
Parallelization; Deep Neural Network; Pruning; Partitioning; Hardware Accelerator
DOI
10.1109/PDP50117.2020.00053
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
As neural networks become more accurate, their model sizes are growing exponentially. A huge number of parameters must therefore be loaded from and stored in the memory hierarchy and processed to perform training or inference. The growing parameter count poses a major challenge for real-time deployment, because improvements in memory bandwidth cannot keep up with the growth in model complexity. Although some operations in neural network processing, such as convolutional layers, are computationally intensive, dense layers are bottlenecked by memory bandwidth. To address this issue, the paper proposes Partition Pruning for dense layers, which reduces the number of required parameters while taking parallelization into account. We evaluated the performance and energy consumption of parallel inference on partitioned models and observed a 7.72x speedup and a 2.73x reduction in energy for computing the pruned fully connected layers of the TinyVGG16 model, compared with running the unpruned model on a single accelerator. In addition, the method showed only a limited reduction in accuracy when partitioning fully connected layers.
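The abstract does not spell out the pruning algorithm, so the following is only a rough sketch of the general idea of pruning a dense layer per partition, not the authors' method. It assumes, purely for illustration, that the layer's output neurons (weight rows) are split into contiguous partitions, one per accelerator, and that each partition is pruned by weight magnitude to the same keep ratio so the partitions carry a similar load. The names `split_rows`, `prune_partition`, `partition_prune`, `count_nonzero`, and `parallel_forward` are hypothetical.

```python
def split_rows(weights, num_partitions):
    """Split the weight rows into contiguous, near-equal partitions."""
    base, extra = divmod(len(weights), num_partitions)
    parts, start = [], 0
    for p in range(num_partitions):
        size = base + (1 if p < extra else 0)
        parts.append(weights[start:start + size])
        start += size
    return parts


def prune_partition(rows, keep_ratio):
    """Zero all but the largest-magnitude weights within one partition.

    Assumption: plain magnitude pruning; ties at the threshold are kept,
    so slightly more than keep_ratio of the weights may survive.
    """
    magnitudes = sorted((abs(w) for row in rows for w in row), reverse=True)
    if not magnitudes:
        return rows
    keep = max(1, int(keep_ratio * len(magnitudes)))
    threshold = magnitudes[keep - 1]
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in rows]


def partition_prune(weights, num_partitions, keep_ratio):
    """Partition first, then prune each partition independently."""
    return [prune_partition(part, keep_ratio)
            for part in split_rows(weights, num_partitions)]


def count_nonzero(rows):
    """Number of surviving (nonzero) weights in a partition."""
    return sum(1 for row in rows for w in row if w != 0.0)


def parallel_forward(partitions, x):
    """Compute y = W x partition by partition.

    Each partition needs only its own rows and the shared input x, so in a
    real system each one could run on a separate accelerator. Here the loop
    is sequential and the partial outputs are simply concatenated.
    """
    y = []
    for rows in partitions:
        y.extend(sum(w * xi for w, xi in zip(row, x)) for row in rows)
    return y


if __name__ == "__main__":
    W = [[0.9, -0.1, 0.05, 0.4],
         [0.2, -0.8, 0.3, 0.01],
         [0.6, 0.02, -0.7, 0.1],
         [-0.03, 0.5, 0.25, -0.95]]
    parts = partition_prune(W, num_partitions=2, keep_ratio=0.5)
    print([count_nonzero(p) for p in parts])  # [4, 4]
    print(parallel_forward(parts, [1.0, 1.0, 1.0, 1.0]))
```

Pruning each partition separately, rather than applying one global threshold, is the design choice that keeps the nonzero counts balanced across partitions in this sketch. Whether the paper balances partitions this way is not stated in the abstract.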
Pages: 307-311
Number of pages: 5