Compact Powers-of-Two: An Efficient Non-Uniform Quantization for Deep Neural Networks

Cited by: 0
Authors
Geng, Xinkuang [1]
Liu, Siting [2]
Jiang, Jianfei [1]
Jiang, Kai [3]
Jiang, Honglan [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Micronano Elect, Shanghai, Peoples R China
[2] ShanghaiTech Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China
[3] Inspur Acad Sci & Technol, Jinan, Shandong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
10.23919/DATE58400.2024.10546652
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
To reduce the computation and memory demands of deep neural networks (DNNs), various quantization techniques have been extensively investigated. However, conventional methods cannot effectively capture the intrinsic data characteristics of DNNs, leading to severe accuracy degradation under low-bit-width quantization. To better align with the bell-shaped distribution, we propose an efficient non-uniform quantization scheme, denoted as compact powers-of-two (CPoT). To avoid the rigid resolution inherent in powers-of-two (PoT) without introducing new issues, we add a fractional part to its encoding, followed by a biasing operation that eliminates the unrepresentable region around 0. This approach effectively balances the grid resolution between the vicinity of 0 and the edge region. To facilitate hardware implementation, we optimize the dot product for CPoT based on the computational characteristics of quantized DNNs, extracting the precomputable terms and incorporating them into the bias. Consequently, a multiply-accumulate (MAC) unit is designed for CPoT using shifters and look-up tables (LUTs). The experimental results show that, even with a certain level of approximation, the proposed CPoT outperforms state-of-the-art methods in data-free quantization (DFQ), a post-training quantization (PTQ) technique focusing on data privacy and computational efficiency. Furthermore, CPoT achieves better area and power efficiency than other methods in hardware implementation.
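The resolution trade-off described in the abstract can be illustrated with a small sketch. It is not the paper's CPoT encoding, which this record does not specify. It assumes, purely for illustration, that the "fractional part" is a set of fractional bits in the exponent code and that the "biasing operation" subtracts the smallest level and rescales. The function names and the parameters `n_codes` and `frac_bits` are hypothetical.

```python
# Illustrative sketch only: compares a plain powers-of-two (PoT) grid with a
# hypothetical grid whose exponent code carries fractional bits, then shifts
# that grid so it reaches 0. The exact CPoT encoding is NOT given in this
# record; the formulas below are assumptions made for illustration.

def pot_levels(n_codes):
    """Positive magnitudes of plain PoT: 2^0, 2^-1, ..., 2^-(n_codes-1)."""
    return sorted(2.0 ** -k for k in range(n_codes))

def fractional_pot_levels(n_codes, frac_bits):
    """Assumed variant: exponent code k carries frac_bits fractional bits."""
    step = 1 << frac_bits
    return sorted(2.0 ** (-k / step) for k in range(n_codes))

def biased_levels(levels):
    """Assumed biasing: subtract the smallest level, rescale so the max stays 1."""
    lo, hi = levels[0], levels[-1]
    return [(v - lo) / (hi - lo) for v in levels]

def largest_gap(levels):
    """Widest spacing between neighbouring levels (coarsest resolution)."""
    return max(b - a for a, b in zip(levels, levels[1:]))

def quantize(x, levels):
    """Map a magnitude x in [0, 1] to the nearest grid level."""
    return min(levels, key=lambda v: abs(v - x))

# Same number of codes (8, i.e. 3 bits) for each grid.
pot = pot_levels(8)
frac = fractional_pot_levels(8, frac_bits=1)
biased = biased_levels(frac)

for name, grid in (("PoT", pot), ("fractional", frac), ("biased", biased)):
    print(name, "smallest level:", grid[0], "largest gap:", largest_gap(grid))
```

With the same number of codes, the fractional grid narrows the widest gap near the edge of the range, but its smallest level moves away from 0. The shift step restores a level at 0, which is the kind of balance between the vicinity of 0 and the edge region that the abstract describes.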
Pages: 6
Related papers
50 in total
  • [41] A non-uniform quantization scheme for visualization of CT images
    Mehmood, Anam
    Khan, Ishtiaq Rasool
    Dawood, Hassan
    Dawood, Hussain
    MATHEMATICAL BIOSCIENCES AND ENGINEERING, 2021, 18 (04) : 4311 - 4326
  • [42] A Trajectory Compression Algorithm Based on Non-uniform Quantization
    Lv, Chengjiao
    Chen, Feng
    Xu, Yongzhi
    Song, Junping
    Lv, Pin
    2015 12TH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (FSKD), 2015, : 2469 - 2474
  • [43] Extracting More Quantum Randomness With Non-Uniform Quantization
    Ji, Bai-Xiang
    Li, Jian
    Wang, Qin
    IEEE PHOTONICS JOURNAL, 2022, 14 (04):
  • [44] Efficient Broadcasting in Known Geometric Radio Networks with Non-uniform Ranges
    Gasieniec, Leszek
    Kowalski, Dariusz R.
    Lingas, Andrzej
    Wahlen, Martin
    DISTRIBUTED COMPUTING, PROCEEDINGS, 2008, 5218 : 274 - +
  • [45] Compact stars with non-uniform relativistic polytrope
    Nouh, Mohamed I.
    Foda, Mona M.
    Aboueisha, Mohamed S.
    SCIENTIFIC REPORTS, 2024, 14 (01):
  • [46] Training deep neural networks with non-uniform frame-level cost function for automatic speech recognition
    Aldonso Becerra
    J. Ismael de la Rosa
    Efrén González
    A. David Pedroza
    N. Iracemi Escalante
    Multimedia Tools and Applications, 2018, 77 : 27231 - 27267
  • [47] Non-uniform Label Smoothing for Diabetic Retinopathy Grading from Retinal Fundus Images with Deep Neural Networks
    Galdran, Adrian
    Chelbi, Jihed
    Kobi, Riadh
    Dolz, Jose
    Lombaert, Herve
    ben Ayed, Ismail
    Chakor, Hadi
    TRANSLATIONAL VISION SCIENCE & TECHNOLOGY, 2020, 9 (02) : 1 - 8
  • [48] Speech Recognition using Deep Neural Networks Trained with Non-uniform Frame-Level Cost Functions
    Becerra, Aldonso
    Ismael de la Rosa, J.
    Gonzalez, Efren
    David Pedroza, A.
    Manuel Martinez, J.
    Iracemi Escalante, N.
    2017 IEEE INTERNATIONAL AUTUMN MEETING ON POWER, ELECTRONICS AND COMPUTING (ROPEC), 2017
  • [49] Training deep neural networks with non-uniform frame-level cost function for automatic speech recognition
    Becerra, Aldonso
    Ismael de la Rosa, J.
    Gonzalez, Efren
    David Pedroza, A.
    Iracemi Escalante, N.
    MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (20) : 27231 - 27267
  • [50] UTransNet: An efficient hybrid architecture of convolutional neural networks and transformer for the approximation of non-uniform steady laminar flow
    Wang, Weiqing
    Yin, Tianle
    Pang, Jing
    PHYSICS OF FLUIDS, 2025, 37 (03)