Compact Powers-of-Two: An Efficient Non-Uniform Quantization for Deep Neural Networks

Cited: 0
Authors
Geng, Xinkuang [1 ]
Liu, Siting [2 ]
Jiang, Jianfei [1 ]
Jiang, Kai [3 ]
Jiang, Honglan [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Micronano Elect, Shanghai, Peoples R China
[2] ShanghaiTech Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China
[3] Inspur Acad Sci & Technol, Jinan, Shandong, Peoples R China
Funding
National Natural Science Foundation of China;
DOI
10.23919/DATE58400.2024.10546652
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
To reduce the computation and memory demands of deep neural networks (DNNs), various quantization techniques have been extensively investigated. However, conventional methods cannot effectively capture the intrinsic data characteristics of DNNs, leading to severe accuracy degradation under low-bit-width quantization. To better align with the bell-shaped distribution of DNN data, we propose an efficient non-uniform quantization scheme, denoted compact powers-of-two (CPoT). To avoid the rigid resolution inherent in powers-of-two (PoT) quantization without introducing new issues, we add a fractional part to its encoding, followed by a biasing operation that eliminates the unrepresentable region around 0. This approach effectively balances the grid resolution between the vicinity of 0 and the edge region. To facilitate hardware implementation, we optimize the dot product for CPoT based on the computational characteristics of quantized DNNs, extracting the precomputable terms and incorporating them into the bias. Consequently, a multiply-accumulate (MAC) unit is designed for CPoT using shifters and look-up tables (LUTs). The experimental results show that, even with a certain level of approximation, our proposed CPoT outperforms state-of-the-art methods in data-free quantization (DFQ), a post-training quantization (PTQ) technique focusing on data privacy and computational efficiency. Furthermore, CPoT demonstrates superior area and power efficiency compared to other methods in hardware implementation.
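The grid-construction idea in the abstract can be illustrated numerically. The sketch below is speculative: the record does not give the exact CPoT encoding, so `cpot_levels` merely assumes that fractional exponent bits refine the PoT grid and that a constant bias shifts the smallest magnitude onto 0, removing the unrepresentable gap around zero; the function names and parameters (`e_bits`, `f_bits`) are hypothetical, not the paper's notation.

```python
import numpy as np

def pot_levels(e_bits):
    """Standard powers-of-two grid: 0 and +/- 2^-i for i = 0..2^e_bits - 1."""
    mags = 2.0 ** -np.arange(2 ** e_bits)
    return np.sort(np.concatenate([-mags, [0.0], mags]))

def cpot_levels(e_bits, f_bits):
    """Hypothetical CPoT-style grid (illustration only).

    Fractional exponent bits refine the resolution between adjacent
    powers of two; subtracting the smallest magnitude ("biasing")
    makes the grid reach 0 exactly, so no region near 0 is
    unrepresentable.
    """
    codes = np.arange(2 ** (e_bits + f_bits)) / 2 ** f_bits
    mags = 2.0 ** -codes           # magnitudes 2^-(i + f/2^f_bits)
    mags = mags - mags.min()       # biasing: smallest magnitude becomes 0
    return np.sort(np.concatenate([-mags, mags]))

def quantize(x, levels):
    """Round each value to the nearest grid level."""
    idx = np.abs(x[:, None] - levels[None, :]).argmin(axis=1)
    return levels[idx]
```

Plotting the two grids side by side shows the qualitative point made in the abstract: pure PoT levels cluster near 0 and leave the edge region coarse, while the refined, biased grid spreads resolution over both regions.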
Pages: 6