EFFICIENT INFERENCE OF IMAGE-BASED NEURAL NETWORK MODELS IN RECONFIGURABLE SYSTEMS WITH PRUNING AND QUANTIZATION

Cited by: 3

Authors:

Flich, Jose ^{[1]}

Medina, Laura ^{[1]}

Catalan, Izan ^{[1]}

Hernandez, Carles ^{[1]}

Bragagnolo, Andrea ^{[2,3]}

Auzanneau, Fabrice ^{[4]}

Briand, David ^{[4]}

Affiliations:

[1] Univ Politecn Valencia, Valencia, Spain

[2] Univ Torino, Dipartimento Informat, Turin, Italy

[3] Synesthesia Srl, Corso Dante 118, I-10126 Turin, Italy

[4] Univ Paris Saclay, CEA, List, F-91120 Palaiseau, France

Source:

2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP | 2022

Keywords:

FPGA; quantization; pruning; inference

DOI:

10.1109/ICIP46576.2022.9897752

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Neural networks (NNs) for image processing in embedded systems face two conflicting requirements: growing demand for computing power as models become more complex, and a constrained resource budget. To alleviate these problems, model compression based on quantization and pruning is common. The derived models then need to fit on reconfigurable systems such as FPGAs for the embedded system to work properly. In this paper, we present HLSinf, an open-source framework for developing custom NN accelerators for FPGAs that provides efficient support for quantized and pruned NN models. With HLSinf, significant inference speedups can be obtained for typical medical image-based applications. In particular, we obtain a speedup of up to 90x over a CPU when we combine quantization and pruning with the flexibility of HLSinf.

Pages: 2491-2495

Number of pages: 5
