Two Novel Non-Uniform Quantizers with Application in Post-Training Quantization

Cited: 0
Authors
Peric, Zoran [1 ]
Aleksic, Danijela [2 ]
Nikolic, Jelena [1 ]
Tomic, Stefan [3 ]
Affiliations
[1] Univ Nis, Fac Elect Engn, Aleksandra Medvedeva 14, Nis 18000, Serbia
[2] Telekom Srbija, Dept Mobile Network Nis, Vozdova 11, Nis 18000, Serbia
[3] Al Dar Univ Coll, Sch Engn & Technol, POB 35529, Dubai, U Arab Emirates
Keywords
non-uniform quantization; support region; post-training quantization; quantized neural networks
DOI
10.3390/math10193435
CLC Classification Number
O1 [Mathematics]
Subject Classification Number
0701; 070101
Abstract
With increased network downsizing and cost minimization in the deployment of neural network (NN) models, edge computing has taken a significant place in modern artificial intelligence. To cope with the memory constraints of less-capable edge systems, a plethora of quantizer models and quantization techniques have been proposed for NN compression, with the goal of fitting the quantized NN (QNN) on the edge device while guaranteeing a high degree of accuracy preservation. NN compression by means of post-training quantization has attracted a lot of research attention, and in this context the efficiency of uniform quantizers (UQs) has been promoted and heavily exploited. In this paper, we propose two novel non-uniform quantizers (NUQs) that prudently utilize one of the two defining properties of the simplest UQ. Although both NUQs use the same rule as the UQ for specifying the support region, each starts from a different setting of the cell widths than a standard UQ. The first quantizer, named the simplest power-of-two quantizer (SPTQ), defines cell widths that scale by powers of two. As in the simplest UQ design, the representation levels of SPTQ are the midpoints of the quantization cells. The second quantizer, named the modified SPTQ (MSPTQ), is a more competitive quantizer model: an enhanced version of SPTQ in which the decision thresholds are centered between the nearest representation levels, similar to the UQ design. These properties make the novel NUQs relatively simple. Unlike in a UQ, the quantization cells of MSPTQ are not of equal width and the representation levels are not the midpoints of the cells. In this paper, we describe the design procedures of SPTQ and MSPTQ and optimize them for an assumed Laplacian source. Afterwards, we perform post-training quantization by implementing SPTQ and MSPTQ, examine the preservation of QNN accuracy, and show the implementation benefits over the case where a UQ with an equal number of quantization cells is utilized in the QNN for the same classification task. We believe that both NUQs are particularly valuable for memory-constrained environments, where simple and acceptably accurate solutions are of crucial importance.
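For readers who want a concrete picture of the two constructions, the sketch below illustrates quantizers of the kind the abstract describes. It is a minimal Python sketch, not the authors' implementation: the symmetric support region [-x_max, x_max], the scaling of the power-of-two cell widths so that they exactly tile the support region, and all function names (sptq_cells, msptq_thresholds, quantize) are assumptions made here for illustration; the paper's actual design and optimization procedure for the Laplacian source is given in the full text.

```python
import numpy as np

def sptq_cells(n_cells, x_max):
    """SPTQ sketch: cell widths on [0, x_max] grow as successive powers
    of two, scaled here (an assumption) so the cells tile [0, x_max]."""
    weights = 2.0 ** np.arange(n_cells)           # 1, 2, 4, ...
    widths = x_max * weights / weights.sum()      # power-of-two widths summing to x_max
    thresholds = np.concatenate(([0.0], np.cumsum(widths)))
    # SPTQ: representation levels are the midpoints of the quantization cells.
    levels = 0.5 * (thresholds[:-1] + thresholds[1:])
    return thresholds, levels

def msptq_thresholds(levels, x_max):
    """MSPTQ sketch: keep the SPTQ levels but re-centre each inner decision
    threshold between the two nearest representation levels, as in a UQ."""
    inner = 0.5 * (levels[:-1] + levels[1:])
    return np.concatenate(([0.0], inner, [x_max]))

def quantize(x, thresholds, levels):
    """Symmetric quantizer: map |x| to its cell's level and keep the sign.
    Inputs beyond x_max are clipped to the outermost cell (overload region)."""
    idx = np.searchsorted(thresholds, np.abs(x), side='right') - 1
    idx = np.clip(idx, 0, len(levels) - 1)
    return np.sign(x) * levels[idx]

# Example: 8 cells per side on [-1, 1], applied to Laplacian samples.
rng = np.random.default_rng(0)
x = rng.laplace(scale=0.3, size=10_000)
thr, lev = sptq_cells(8, 1.0)
print("SPTQ MSE: ", np.mean((x - quantize(x, thr, lev)) ** 2))
print("MSPTQ MSE:", np.mean((x - quantize(x, msptq_thresholds(lev, 1.0), lev)) ** 2))
```

Under these assumptions, SPTQ and MSPTQ share the same representation levels and differ only in where the decision thresholds sit. Centring each threshold between the two nearest levels is the nearest-neighbour condition from Lloyd-Max theory, so for fixed levels MSPTQ cannot have higher mean-squared error than SPTQ, which is consistent with the abstract's description of MSPTQ as the more competitive model.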
Pages: 21