SYQ: Learning Symmetric Quantization For Efficient Deep Neural Networks

Cited by: 68
Authors
Faraone, Julian [1 ]
Fraser, Nicholas [2 ]
Blott, Michaela [2 ]
Leong, Philip H. W. [1 ]
Affiliations
[1] Univ Sydney, Sydney, NSW, Australia
[2] Xilinx Res Labs, Dublin, Ireland
Keywords
DOI
10.1109/CVPR.2018.00452
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Inference for state-of-the-art deep neural networks is computationally expensive, making them difficult to deploy on constrained hardware environments. An efficient way to reduce this complexity is to quantize the weight parameters and/or activations during training by approximating their distributions with a limited-entry codebook. At very low precisions, such as binary or ternary networks with 1- to 8-bit activations, the information loss from quantization leads to significant accuracy degradation due to large gradient mismatches between the forward and backward functions. In this paper, we introduce a quantization method to reduce this loss by learning a symmetric codebook for particular weight subgroups. These subgroups are determined based on their locality in the weight matrix, such that the hardware simplicity of the low-precision representations is preserved. Empirically, we show that symmetric quantization can substantially improve accuracy for networks with extremely low-precision weights and activations. We also demonstrate that this representation imposes minimal or no hardware implications compared to more coarse-grained approaches. Source code is available at https://www.github.com/julianfaraone/SYQ.
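As a rough illustration of the idea in the abstract, the sketch below quantizes each row of a weight matrix to a symmetric ternary codebook {-alpha, 0, +alpha} with one scale per row. It is not the authors' implementation: the function name `symmetric_ternary_quantize`, the `threshold_ratio` parameter, the choice of rows as subgroups, and the closed-form scale (mean magnitude of the surviving weights, rather than a scale learned during training) are all assumptions made for this example.

```python
import numpy as np


def symmetric_ternary_quantize(weights, threshold_ratio=0.05):
    """Quantize each row to a symmetric codebook {-alpha, 0, +alpha}.

    Hypothetical sketch: rows stand in for the weight subgroups, and alpha is
    computed in closed form instead of being learned by gradient descent.
    """
    weights = np.asarray(weights, dtype=np.float64)

    # Assumed pruning threshold: a fixed fraction of the largest magnitude.
    threshold = threshold_ratio * np.max(np.abs(weights))

    # Ternary codes in {-1, 0, +1}.
    codes = np.where(np.abs(weights) > threshold, np.sign(weights), 0.0)

    # One scale per row: mean magnitude of the weights that were not zeroed.
    kept = np.abs(codes).sum(axis=1, keepdims=True)
    magnitude = (np.abs(weights) * np.abs(codes)).sum(axis=1, keepdims=True)
    alphas = np.divide(magnitude, kept, out=np.zeros_like(magnitude), where=kept > 0)

    # The positive and negative levels share the same alpha, so the codebook
    # of every row is symmetric around zero.
    return codes * alphas, codes.astype(np.int8), alphas


example = np.array([[0.5, -0.5, 0.0], [2.0, 0.01, -1.0]])
quantized, codes, alphas = symmetric_ternary_quantize(example)
print(quantized)
```

Because every row stores only small integer codes plus a single scale, the multiply by `alphas` can be applied once per row after the integer accumulation, which is the kind of hardware simplicity the abstract refers to.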
Pages: 4300-4309
Number of pages: 10
Related papers
50 in total
  • [21] FIGHTING OVER-FITTING WITH QUANTIZATION FOR LEARNING DEEP NEURAL NETWORKS ON NOISY LABELS
    Tallec, Gauthier
    Yvinec, Edouard
    Dapogny, Arnaud
    Bailly, Kevin
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 575 - 579
  • [22] Quantization of Deep Neural Networks for Accurate Edge Computing
    Chen, Wentao
    Qiu, Hailong
    Zhuang, Jian
    Zhang, Chutong
    Hu, Yu
    Lu, Qing
    Wang, Tianchen
    Shi, Yiyu
    Huang, Meiping
    Xu, Xiaowe
    ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS, 2021, 17 (04)
  • [23] Quantization Effects of Deep Neural Networks on a FPGA platform
    Pohl, Daniel
    Vogel-Heuser, Birgit
    Krueger, Marius
    Echtler, Markus
    2024 IEEE 7TH INTERNATIONAL CONFERENCE ON INDUSTRIAL CYBER-PHYSICAL SYSTEMS, ICPS 2024, 2024,
  • [24] Compact Powers-of-Two: An Efficient Non-Uniform Quantization for Deep Neural Networks
    Geng, Xinkuang
    Liu, Siting
    Jiang, Jianfei
    Jiang, Kai
    Jiang, Honglan
    2024 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION, DATE, 2024,
  • [25] Structured Dynamic Precision for Deep Neural Networks Quantization
    Huang, Kai
    Li, Bowen
    Xiong, Dongliang
    Jiang, Haitian
    Jiang, Xiaowen
    Yan, Xiaolang
    Claesen, Luc
    Liu, Dehong
    Chen, Junjian
    Liu, Zhili
    ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS, 2023, 28 (01)
  • [26] Efficient Hardware Design of Convolutional Neural Networks for Accelerated Deep Learning
    Khalil, Kasem
    Khan, Md Rahat
    Bayoumi, Magdy
    Sherif, Ahmed
    2024 IEEE 67TH INTERNATIONAL MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS, MWSCAS 2024, 2024, : 1075 - 1079
  • [27] Computationally Efficient Training of Deep Neural Networks via Transfer Learning
    Oyen, Diane
    REAL-TIME IMAGE PROCESSING AND DEEP LEARNING 2019, 2019, 10996
  • [28] An efficient deep learning approach to identify dynamics in in vitro neural networks
    Pastore, Vito Paolo
    Parodi, Giulia
    Brofiga, Martina
    Massobrio, Paolo
    Chiappalone, Michela
    Odone, Francesca
    Martinoia, Sergio
    2023 45TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE & BIOLOGY SOCIETY, EMBC, 2023,
  • [29] Augmented Efficient BackProp for Backpropagation Learning in Deep Autoassociative Neural Networks
    Embrechts, Mark J.
    Hargis, Blake J.
    Linton, Jonathan D.
    2010 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS IJCNN 2010, 2010,
  • [30] An Efficient Learning Algorithm for Direct Training Deep Spiking Neural Networks
    Zhu, Xiaolei
    Zhao, Baixin
    Ma, De
    Tang, Huajin
    IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2022, 14 (03) : 847 - 856