Partition Pruning: Parallelization-Aware Pruning for Dense Neural Networks

Cited by: 0
Authors
Shahhosseini, Sina [1 ]
Albaqsami, Ahmad [1 ]
Jasemi, Masoomeh [1 ,2 ]
Bagherzadeh, Nader [1 ]
Affiliations
[1] Univ Calif Irvine, Elect Engn & Comp Sci Dept, Irvine, CA 92697 USA
[2] Sharif Univ Technol, Dept Comp Engn, Tehran, Iran
Keywords
Parallelization; Deep Neural Network; Pruning; Partitioning; Hardware Accelerator;
DOI
10.1109/PDP50117.2020.00053
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
As neural networks are improved to become more accurate, their model sizes grow exponentially. A huge number of parameters must therefore be loaded from and stored in the memory hierarchy and computed in processors during the training or inference phase. This growth poses a major challenge for real-time deployment, since improvements in memory bandwidth cannot keep up with the growth in model complexity. Although some operations in neural network processing, such as convolutional layers, are compute-intensive, dense layers face a memory bandwidth bottleneck. To address this issue, this paper proposes Partition Pruning for dense layers, which reduces the number of required parameters while taking parallelization into account. We evaluated the performance and energy consumption of parallel inference on partitioned models, observing a 7.72x speedup and a 2.73x reduction in energy when computing the pruned fully connected layers of the TinyVGG16 model, compared to running the unpruned model on a single accelerator. Moreover, our method showed only a limited reduction in accuracy when partitioning fully connected layers.
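The partition-then-prune idea described in the abstract can be illustrated with a short sketch. The snippet below is only a minimal, illustrative approximation: it assumes a fully connected weight matrix is split column-wise across accelerators and pruned by weight magnitude within each partition; the paper's actual partitioning scheme and pruning criterion may differ, and the function name partition_prune_fc is hypothetical.

```python
import numpy as np

def partition_prune_fc(weights, num_partitions, sparsity):
    """Illustrative sketch: partition a dense layer's weights, then prune.

    The weight matrix is split column-wise into one block per accelerator,
    and the smallest-magnitude weights are zeroed within each block, so every
    partition keeps the same fraction of parameters.
    """
    pruned_parts = []
    for block in np.array_split(weights, num_partitions, axis=1):
        # Per-partition magnitude threshold: keep roughly the largest
        # (1 - sparsity) fraction of weights inside this block.
        threshold = np.quantile(np.abs(block), sparsity)
        pruned_parts.append(np.where(np.abs(block) >= threshold, block, 0.0))
    return pruned_parts  # one sparse block per accelerator

# Example: a 512x1024 dense layer split across 4 accelerators at 80% sparsity.
W = np.random.randn(512, 1024).astype(np.float32)
blocks = partition_prune_fc(W, num_partitions=4, sparsity=0.8)
print([float((b != 0).mean()) for b in blocks])  # ~0.2 density per partition
```

Pruning each partition to the same sparsity keeps the per-accelerator workload balanced, which is what would make the pruning parallelization-aware in this sketch.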
Pages: 307 - 311
Number of pages: 5