Flexible Quantization for Efficient Convolutional Neural Networks

Cited by: 3
Authors
Zacchigna, Federico Giordano [1]
Lew, Sergio [2, 3]
Lutenberg, Ariel [1, 3]
Affiliations
[1] Univ Buenos Aires, Fac Ingn FIUBA, Lab Sistemas Embebidos LSE, C1063ACV, Buenos Aires, Argentina
[2] Univ Buenos Aires, Fac Ingn FIUBA, Inst Ingn Biomed IIBM, C1063ACV, Buenos Aires, Argentina
[3] Consejo Nacl Invest Cient & Tecn CONICET, C1425FQB, Buenos Aires, Argentina
Keywords
CNN; quantization; uniform; non-uniform; mixed-precision; FPGA; ASIC; edge devices; embedded systems
DOI
10.3390/electronics13101923
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This work focuses on the efficient quantization of convolutional neural networks (CNNs). Specifically, we introduce non-uniform uniform quantization (NUUQ), a novel quantization methodology that combines the benefits of non-uniform quantization, such as high compression levels, with the advantages of uniform quantization, which enables an efficient implementation in fixed-point hardware. NUUQ is based on decoupling the quantization levels from the number of bits. This decoupling allows for a trade-off between the spatial and temporal complexity of the implementation, which can be leveraged to further reduce the spatial complexity of the CNN without a significant performance loss. Additionally, we explore different quantization configurations and address typical use cases. The NUUQ algorithm demonstrates the capability to achieve compression levels equivalent to 2 bits without an accuracy loss, and even levels equivalent to approximately 1.58 bits with a performance loss of only about 0.6%.
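The idea of decoupling the number of quantization levels from the number of bits can be illustrated with a minimal sketch. It is not the authors' NUUQ algorithm, which the abstract does not specify. It assumes that the levels are a small subset of a uniform fixed-point grid, and that the "equivalent bits" figure is the base-2 logarithm of the number of levels, so that three levels give log2(3), about 1.58. The names `equivalent_bits`, `quantize_to_levels`, `GRID`, `LEVELS`, and `WEIGHTS`, as well as the 4-bit grid and the chosen levels, are hypothetical.

```python
import math


def equivalent_bits(num_levels):
    """Bits needed to index num_levels distinct values (assumed metric)."""
    return math.log2(num_levels)


def quantize_to_levels(weights, levels):
    """Map each weight to its nearest level; return values and level indices."""
    indices = [
        min(range(len(levels)), key=lambda i: abs(w - levels[i]))
        for w in weights
    ]
    return [levels[i] for i in indices], indices


# Hypothetical 4-bit signed fixed-point grid with step 1/8: -1.0 ... 0.875.
STEP = 1 / 8
GRID = [k * STEP for k in range(-8, 8)]

# Hypothetical choice of 3 of the 16 grid points: -0.5, 0.0, 0.5.
# Each level is still an exact fixed-point value, but only 3 are in use.
LEVELS = [GRID[4], GRID[8], GRID[12]]

# Made-up weights, for illustration only.
WEIGHTS = [-0.61, -0.2, 0.03, 0.3, 0.9]

quantized, indices = quantize_to_levels(WEIGHTS, LEVELS)

print("equivalent bits:", round(equivalent_bits(len(LEVELS)), 3))
print("quantized:", quantized)
print("indices:", indices)
```

In this sketch the stored index needs only about 1.58 bits per weight, while the arithmetic can still operate on 4-bit fixed-point values.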
Pages: 16
Related papers
50 in total
  • [31] Quantization of Deep Convolutional Networks
    Huang, Yea-Shuan
    Slot, Charles Djimy
    Yu, Chang Wu
    2019 INTERNATIONAL CONFERENCE ON IMAGE AND VIDEO PROCESSING, AND ARTIFICIAL INTELLIGENCE, 2019, 11321
  • [32] XNORAM: An Efficient Computing-in-Memory Architecture for Binary Convolutional Neural Networks with Flexible Dataflow Mapping
    Liu, Shiwei
    Zhu, Haozhe
    Chen, Chixiao
    Zhang, Lihua
    Shi, C-J Richard
    2020 2ND IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE CIRCUITS AND SYSTEMS (AICAS 2020), 2020, : 21 - 25
  • [33] An Energy-Efficient and Flexible Accelerator based on Reconfigurable Computing for Multiple Deep Convolutional Neural Networks
    Yang, Chen
    Zhang, HaiBo
    Wang, XiaoLi
    Geng, Li
    2018 14TH IEEE INTERNATIONAL CONFERENCE ON SOLID-STATE AND INTEGRATED CIRCUIT TECHNOLOGY (ICSICT), 2018, : 1389 - 1391
  • [34] An Efficient Accelerator with Winograd for Novel Convolutional Neural Networks
    Lin, Zhijian
    Zhang, Meng
    Weng, Dongpeng
    Liu, Fei
    2022 5TH INTERNATIONAL CONFERENCE ON CIRCUITS, SYSTEMS AND SIMULATION (ICCSS 2022), 2022, : 126 - 130
  • [35] Efficient quantum state tomography with convolutional neural networks
    Schmale, Tobias
    Reh, Moritz
    Gaerttner, Martin
    NPJ QUANTUM INFORMATION, 2022, 8 (01)
  • [36] An Efficient Dataflow Mapping Method for Convolutional Neural Networks
    Liu, Zhuangzhuang
    Gu, Huaxi
    Zhang, Bowen
    Shi, Canran
    NEURAL PROCESSING LETTERS, 2022, 54 : 1075 - 1090
  • [37] Memory Efficient Binary Convolutional Neural Networks on Microcontrollers
    Sakr, Fouad
    Berta, Riccardo
    Doyle, Joseph
    Younes, Hamoud
    De Gloria, Alessandro
    Bellotti, Francesco
    2022 IEEE INTERNATIONAL CONFERENCE ON EDGE COMPUTING & COMMUNICATIONS (IEEE EDGE 2022), 2022, : 169 - 177
  • [38] Fast and Efficient Implementation of Convolutional Neural Networks on FPGA
    Podili, Abhinav
    Zhang, Chi
    Prasanna, Viktor
    2017 IEEE 28TH INTERNATIONAL CONFERENCE ON APPLICATION-SPECIFIC SYSTEMS, ARCHITECTURES AND PROCESSORS (ASAP), 2017, : 11 - 18
  • [39] A Power-efficient Accelerator for Convolutional Neural Networks
    Sun, Fan
    Wang, Chao
    Gong, Lei
    Xu, Chongchong
    Zhang, Yiwei
    Lu, Yuntao
    Li, Xi
    Zhou, Xuehai
    2017 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER), 2017, : 631 - 632
  • [40] Efficient Utilization of FPGA Multipliers for Convolutional Neural Networks
    Boulasikis, M. A.
    Birbas, M.
    Tsafas, N.
    Kanakaris, N.
    2021 10TH INTERNATIONAL CONFERENCE ON MODERN CIRCUITS AND SYSTEMS TECHNOLOGIES (MOCAST), 2021,