Flexible Quantization for Efficient Convolutional Neural Networks

Cited by: 3
Authors
Zacchigna, Federico Giordano [1]
Lew, Sergio [2,3]
Lutenberg, Ariel [1,3]
Affiliations
[1] Univ Buenos Aires, Fac Ingn FIUBA, Lab Sistemas Embebidos LSE, C1063ACV, Buenos Aires, Argentina
[2] Univ Buenos Aires, Fac Ingn FIUBA, Inst Ingn Biomed IIBM, C1063ACV, Buenos Aires, Argentina
[3] Consejo Nacl Invest Cient & Tecn CONICET, C1425FQB, Buenos Aires, Argentina
Keywords
CNN; quantization; uniform; non-uniform; mixed-precision; FPGA; ASIC; edge devices; embedded systems
DOI
10.3390/electronics13101923
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This work focuses on the efficient quantization of convolutional neural networks (CNNs). Specifically, we introduce a method called non-uniform uniform quantization (NUUQ), a novel quantization methodology that combines the benefits of non-uniform quantization, such as high compression levels, with the advantages of uniform quantization, which enables an efficient implementation in fixed-point hardware. NUUQ is based on decoupling the quantization levels from the number of bits. This decoupling allows for a trade-off between the spatial and temporal complexity of the implementation, which can be leveraged to further reduce the spatial complexity of the CNN without a significant performance loss. Additionally, we explore different quantization configurations and address typical use cases. The NUUQ algorithm demonstrates the capability to achieve compression levels equivalent to 2 bits without an accuracy loss, and even levels equivalent to ~1.58 bits with a loss in performance of only ~0.6%.
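The idea of decoupling the number of quantization levels from the number of bits can be illustrated with a minimal sketch. This is not the NUUQ algorithm from the paper; it is a plain symmetric uniform quantizer, assumed here only for illustration, whose level count need not be a power of two. The function names `quantize_uniform_levels` and `equivalent_bits` are hypothetical. The sketch also shows that 3 levels correspond to log2(3) ≈ 1.58 equivalent bits; whether the paper reaches its ~1.58-bit figure with exactly 3 levels is an assumption.

```python
import math
import numpy as np

def quantize_uniform_levels(w, num_levels):
    """Snap each weight to the nearest of `num_levels` evenly spaced values
    spanning [-max|w|, +max|w|]. Illustrative only, not the paper's NUUQ."""
    if num_levels < 2:
        raise ValueError("num_levels must be at least 2")
    w = np.asarray(w, dtype=np.float64)
    scale = np.max(np.abs(w))
    if scale == 0.0:
        return np.zeros_like(w)
    step = 2.0 * scale / (num_levels - 1)   # spacing between adjacent levels
    idx = np.round((w + scale) / step)      # integer level index, 0..num_levels-1
    return idx * step - scale

def equivalent_bits(num_levels):
    """Information content of one quantized value, in bits."""
    return math.log2(num_levels)

weights = [-1.0, -0.4, 0.1, 0.6, 1.0]
q3 = quantize_uniform_levels(weights, 3)    # three levels: -1, 0, +1
bits3 = equivalent_bits(3)                  # about 1.585
bits4 = equivalent_bits(4)                  # exactly 2.0
```

Because the level count is a free parameter, a compression target such as "equivalent to 2 bits" is expressed through `num_levels` rather than through the storage word width, which is the kind of decoupling the abstract refers to.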
Pages: 16
Related papers
50 in total
  • [21] An Efficient Accelerator for Sparse Convolutional Neural Networks
    You, Weijie
    Wu, Chang
    2019 IEEE 13TH INTERNATIONAL CONFERENCE ON ASIC (ASICON), 2019,
  • [22] Efficient Computation of Robustness of Convolutional Neural Networks
    Arcaini, Paolo
    Bombarda, Andrea
    Bonfanti, Silvia
    Gargantini, Angelo
    THIRD IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE TESTING (AITEST 2021), 2021, : 21 - 28
  • [23] Efficient Implementation of Convolutional Neural Networks on FPGA
    Hadnagy, A.
    Feher, B.
    Kovacshazy, T.
    2018 19TH INTERNATIONAL CARPATHIAN CONTROL CONFERENCE (ICCC), 2018, : 359 - 364
  • [24] Efficient Hardware Acceleration of Convolutional Neural Networks
    Kala, S.
    Jose, Babita R.
    Mathew, Jimson
    Nalesh, S.
    32ND IEEE INTERNATIONAL SYSTEM ON CHIP CONFERENCE (IEEE SOCC 2019), 2019, : 191 - 192
  • [25] RNA: A Flexible and Efficient Accelerator Based on Dynamically Reconfigurable Computing for Multiple Convolutional Neural Networks
    Yang, Chen
    Hou, Jia
    Wang, Yizhou
    Zhang, Haibo
    Wang, Xiaoli
    Geng, Li
    JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS, 2022, 31 (16)
  • [26] A Flexible Sparsity-Aware Accelerator with High Sensitivity and Efficient Operation for Convolutional Neural Networks
    Yuan, Haiying
    Zeng, Zhiyong
    Cheng, Junpeng
    Li, Minghao
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2022, 41 (08) : 4370 - 4389
  • [27] A Flexible Sparsity-Aware Accelerator with High Sensitivity and Efficient Operation for Convolutional Neural Networks
    Haiying Yuan
    Zhiyong Zeng
    Junpeng Cheng
    Minghao Li
    Circuits, Systems, and Signal Processing, 2022, 41 : 4370 - 4389
  • [28] General Bitwidth Assignment for Efficient Deep Convolutional Neural Network Quantization
    Fei, Wen
    Dai, Wenrui
    Li, Chenglin
    Zou, Junni
    Xiong, Hongkai
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (10) : 5253 - 5267
  • [29] Quantization and sparsity-aware processing for energy-efficient NVM-based convolutional neural networks
    Bao, Han
    Qin, Yifan
    Chen, Jia
    Yang, Ling
    Li, Jiancong
    Zhou, Houji
    Li, Yi
    Miao, Xiangshui
    FRONTIERS IN ELECTRONICS, 2022, 3
  • [30] Quantization of constrained processor data paths applied to Convolutional Neural Networks
    de Bruin, Barry
    Zivkovic, Zoran
    Corporaal, Henk
    2018 21ST EUROMICRO CONFERENCE ON DIGITAL SYSTEM DESIGN (DSD 2018), 2018, : 357 - 364