Compact Powers-of-Two: An Efficient Non-Uniform Quantization for Deep Neural Networks

Cited: 0
Authors
Geng, Xinkuang [1 ]
Liu, Siting [2 ]
Jiang, Jianfei [1 ]
Jiang, Kai [3 ]
Jiang, Honglan [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Micronano Elect, Shanghai, Peoples R China
[2] ShanghaiTech Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China
[3] Inspur Acad Sci & Technol, Jinan, Shandong, Peoples R China
Funding
National Natural Science Foundation of China;
DOI
10.23919/DATE58400.2024.10546652
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
To reduce the computation and memory demands of deep neural networks (DNNs), various quantization techniques have been extensively investigated. However, conventional methods cannot effectively capture the intrinsic data characteristics of DNNs, leading to severe accuracy degradation under low-bit-width quantization. To better align with the bell-shaped distribution of DNN data, we propose an efficient non-uniform quantization scheme, denoted compact powers-of-two (CPoT). To avoid the rigid resolution inherent in powers-of-two (PoT) quantization without introducing new issues, we add a fractional part to its encoding, followed by a biasing operation that eliminates the unrepresentable region around 0. This approach effectively balances the grid resolution between the vicinity of 0 and the edge region. To facilitate hardware implementation, we optimize the dot product for CPoT based on the computational characteristics of quantized DNNs, extracting the precomputable terms and incorporating them into the bias. Consequently, a multiply-accumulate (MAC) unit is designed for CPoT using shifters and look-up tables (LUTs). The experimental results show that, even with a certain level of approximation, our proposed CPoT outperforms state-of-the-art methods in data-free quantization (DFQ), a post-training quantization (PTQ) technique focusing on data privacy and computational efficiency. Furthermore, CPoT demonstrates superior area and power efficiency compared to other methods in hardware implementation.
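The grid-construction idea in the abstract can be illustrated numerically. The sketch below is speculative: the record does not give the exact CPoT encoding, so `cpot_levels` merely assumes that fractional exponent bits refine the PoT grid and that a constant bias shifts the smallest magnitude onto 0, removing the unrepresentable gap around zero; the function names and parameters (`e_bits`, `f_bits`) are hypothetical, not the paper's notation.

```python
import numpy as np

def pot_levels(e_bits):
    """Standard powers-of-two grid: 0 and +/- 2^-i for i = 0..2^e_bits - 1."""
    mags = 2.0 ** -np.arange(2 ** e_bits)
    return np.sort(np.concatenate([-mags, [0.0], mags]))

def cpot_levels(e_bits, f_bits):
    """Hypothetical CPoT-style grid (illustration only).

    Fractional exponent bits refine the resolution between adjacent
    powers of two; subtracting the smallest magnitude ("biasing")
    makes the grid reach 0 exactly, so no region near 0 is
    unrepresentable.
    """
    codes = np.arange(2 ** (e_bits + f_bits)) / 2 ** f_bits
    mags = 2.0 ** -codes           # magnitudes 2^-(i + f/2^f_bits)
    mags = mags - mags.min()       # biasing: smallest magnitude becomes 0
    return np.sort(np.concatenate([-mags, mags]))

def quantize(x, levels):
    """Round each value to the nearest grid level."""
    idx = np.abs(x[:, None] - levels[None, :]).argmin(axis=1)
    return levels[idx]
```

Plotting the two grids side by side shows the qualitative point made in the abstract: pure PoT levels cluster near 0 and leave the edge region coarse, while the refined, biased grid spreads resolution over both regions.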
Pages: 6