EFFICIENT INFERENCE OF IMAGE-BASED NEURAL NETWORK MODELS IN RECONFIGURABLE SYSTEMS WITH PRUNING AND QUANTIZATION

Cited by: 3
Authors
Flich, Jose [1]
Medina, Laura [1]
Catalan, Izan [1]
Hernandez, Carles [1]
Bragagnolo, Andrea [2, 3]
Auzanneau, Fabrice [4]
Briand, David [4]
Affiliations
[1] Univ Politecn Valencia, Valencia, Spain
[2] Univ Torino, Dipartimento Informat, Turin, Italy
[3] Synesthesia Srl, Corso Dante 118, I-10126 Turin, Italy
[4] Univ Paris Saclay, CEA, List, F-91120 Palaiseau, France
Keywords
FPGA; quantization; pruning; inference
DOI
10.1109/ICIP46576.2022.9897752
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Neural networks (NNs) for image processing in embedded systems face two conflicting requirements: computing power needs that grow as models become more complex, and a constrained resource budget. To alleviate these problems, model compression based on quantization and pruning is common. The derived models then need to fit on reconfigurable systems such as FPGAs for the embedded system to work properly. In this paper, we present HLSinf, an open-source framework for developing custom NN accelerators for FPGAs that provides efficient support for quantized and pruned NN models. With HLSinf, significant inference speedups can be obtained for typical medical image-based applications. In particular, we obtain a speedup of up to 90x over a CPU when we combine quantization and pruning with the flexibility of HLSinf.
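For readers unfamiliar with the two compression techniques named in the abstract, the following is a minimal illustrative sketch in plain Python. It is not taken from the paper and does not use the HLSinf API: the function names (`prune_by_magnitude`, `quantize_int8`, `dequantize`) are hypothetical, and the specific choices shown (magnitude-based pruning, symmetric per-tensor 8-bit quantization) are assumptions made for illustration, not necessarily the schemes the authors use.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_int8(weights):
    """Symmetric per-tensor quantization to integers in [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map the integer codes back to approximate floating-point weights."""
    return [v * scale for v in q]


# Toy weight vector (made-up values, for illustration only).
weights = [0.8, -0.05, 0.3, 0.01, -1.27, 0.02]
pruned = prune_by_magnitude(weights, sparsity=0.5)
q, scale = quantize_int8(pruned)
approx = dequantize(q, scale)
```

In this sketch, pruning removes the three smallest-magnitude weights, and quantization stores the remaining ones as small integers plus a single scale factor. Zeroed weights can be skipped and narrow integers need less arithmetic and storage, which is the general reason such models are easier to fit on resource-constrained hardware such as FPGAs.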
Pages: 2491-2495
Number of pages: 5