Two Novel Non-Uniform Quantizers with Application in Post-Training Quantization

Cited: 0
Authors
Peric, Zoran [1 ]
Aleksic, Danijela [2 ]
Nikolic, Jelena [1 ]
Tomic, Stefan [3 ]
Affiliations
[1] Univ Nis, Fac Elect Engn, Aleksandra Medvedeva 14, Nis 18000, Serbia
[2] Telekom Srbija, Dept Mobile Network Nis, Vozdova 11, Nis 18000, Serbia
[3] Al Dar Univ Coll, Sch Engn & Technol, POB 35529, Dubai, U Arab Emirates
Keywords
non-uniform quantization; support region; post-training quantization; quantized neural networks
DOI
10.3390/math10193435
CLC Classification Number
O1 [Mathematics]
Subject Classification Number
0701; 070101
Abstract
With increased network downsizing and cost minimization in the deployment of neural network (NN) models, edge computing has taken a significant place in modern artificial intelligence. To cope with the memory constraints of less-capable edge systems, a plethora of quantizer models and quantization techniques have been proposed for NN compression, with the goal of fitting the quantized NN (QNN) on the edge device while guaranteeing a high degree of accuracy preservation. NN compression by means of post-training quantization has attracted a lot of research attention, and in this context the efficiency of uniform quantizers (UQs) has been promoted and heavily exploited. In this paper, we propose two novel non-uniform quantizers (NUQs) that prudently utilize one of the two defining properties of the simplest UQ. Although both NUQs use the same rule as the UQ for specifying the support region, each starts from a different setting of the cell widths than a standard UQ. The first quantizer, named the simplest power-of-two quantizer (SPTQ), defines cell widths that scale by powers of two. As in the simplest UQ design, the representation levels of SPTQ are the midpoints of the quantization cells. The second quantizer, named the modified SPTQ (MSPTQ), is a more competitive quantizer model: an enhanced version of SPTQ in which the decision thresholds are centered between the nearest representation levels, similar to the UQ design. These properties make the novel NUQs relatively simple. Unlike in a UQ, the quantization cells of MSPTQ are not of equal width and the representation levels are not the midpoints of the cells. In this paper, we describe the design procedures of SPTQ and MSPTQ and optimize them for an assumed Laplacian source. Afterwards, we perform post-training quantization by implementing SPTQ and MSPTQ, examine the preservation of QNN accuracy, and show the implementation benefits over the case where a UQ with an equal number of quantization cells is utilized in the QNN for the same classification task. We believe that both NUQs are particularly valuable for memory-constrained environments, where simple and acceptably accurate solutions are of crucial importance.
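For readers who want a concrete picture of the two constructions, the sketch below illustrates quantizers of the kind the abstract describes. It is a minimal Python sketch, not the authors' implementation: the symmetric support region [-x_max, x_max], the scaling of the power-of-two cell widths so that they exactly tile the support region, and all function names (sptq_cells, msptq_thresholds, quantize) are assumptions made here for illustration; the paper's actual design and optimization procedure for the Laplacian source is given in the full text.

```python
import numpy as np

def sptq_cells(n_cells, x_max):
    """SPTQ sketch: cell widths on [0, x_max] grow as successive powers
    of two, scaled here (an assumption) so the cells tile [0, x_max]."""
    weights = 2.0 ** np.arange(n_cells)           # 1, 2, 4, ...
    widths = x_max * weights / weights.sum()      # power-of-two widths summing to x_max
    thresholds = np.concatenate(([0.0], np.cumsum(widths)))
    # SPTQ: representation levels are the midpoints of the quantization cells.
    levels = 0.5 * (thresholds[:-1] + thresholds[1:])
    return thresholds, levels

def msptq_thresholds(levels, x_max):
    """MSPTQ sketch: keep the SPTQ levels but re-centre each inner decision
    threshold between the two nearest representation levels, as in a UQ."""
    inner = 0.5 * (levels[:-1] + levels[1:])
    return np.concatenate(([0.0], inner, [x_max]))

def quantize(x, thresholds, levels):
    """Symmetric quantizer: map |x| to its cell's level and keep the sign.
    Inputs beyond x_max are clipped to the outermost cell (overload region)."""
    idx = np.searchsorted(thresholds, np.abs(x), side='right') - 1
    idx = np.clip(idx, 0, len(levels) - 1)
    return np.sign(x) * levels[idx]

# Example: 8 cells per side on [-1, 1], applied to Laplacian samples.
rng = np.random.default_rng(0)
x = rng.laplace(scale=0.3, size=10_000)
thr, lev = sptq_cells(8, 1.0)
print("SPTQ MSE: ", np.mean((x - quantize(x, thr, lev)) ** 2))
print("MSPTQ MSE:", np.mean((x - quantize(x, msptq_thresholds(lev, 1.0), lev)) ** 2))
```

Under these assumptions, SPTQ and MSPTQ share the same representation levels and differ only in where the decision thresholds sit. Centring each threshold between the two nearest levels is the nearest-neighbour condition from Lloyd-Max theory, so for fixed levels MSPTQ cannot have higher mean-squared error than SPTQ, which is consistent with the abstract's description of MSPTQ as the more competitive model.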
Pages: 21