SYQ: Learning Symmetric Quantization For Efficient Deep Neural Networks

Cited by: 68
Authors
Faraone, Julian [1 ]
Fraser, Nicholas [2 ]
Blott, Michaela [2 ]
Leong, Philip H. W. [1 ]
Affiliations
[1] Univ Sydney, Sydney, NSW, Australia
[2] Xilinx Res Labs, Dublin, Ireland
Keywords
DOI
10.1109/CVPR.2018.00452
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Inference for state-of-the-art deep neural networks is computationally expensive, making them difficult to deploy on constrained hardware environments. An efficient way to reduce this complexity is to quantize the weight parameters and/or activations during training by approximating their distributions with a limited-entry codebook. At very low precisions, such as binary or ternary networks with 1-8-bit activations, the information loss from quantization leads to significant accuracy degradation due to large gradient mismatches between the forward and backward functions. In this paper, we introduce a quantization method to reduce this loss by learning a symmetric codebook for particular weight subgroups. These subgroups are determined based on their locality in the weight matrix, such that the hardware simplicity of the low-precision representations is preserved. Empirically, we show that symmetric quantization can substantially improve accuracy for networks with extremely low-precision weights and activations. We also demonstrate that this representation imposes minimal or no additional hardware cost relative to more coarse-grained approaches. Source code is available at https://www.github.com/julianfaraone/SYQ.
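The sketch below is an illustrative NumPy rendering of the general idea described in the abstract (a symmetric ternary codebook with a separate scaling factor per weight subgroup), not the authors' implementation, which is available at the repository linked above. The row-wise choice of subgroups, the fixed threshold, and the closed-form scale are assumptions made for brevity; in SYQ the scaling factors are learned during training and gradients reach the full-precision weights via a straight-through estimator.

import numpy as np

def ternarize_symmetric(W, threshold=0.05):
    # Map each weight to the symmetric codebook {-1, 0, +1} using a fixed threshold.
    Q = np.zeros_like(W)
    Q[W > threshold] = 1.0
    Q[W < -threshold] = -1.0
    return Q

def rowwise_scales(W, Q):
    # One scaling factor per row (an assumed "subgroup"); here it is simply the
    # mean magnitude of the retained weights, whereas SYQ learns these factors.
    scales = np.zeros(W.shape[0])
    for i in range(W.shape[0]):
        nz = Q[i] != 0
        scales[i] = np.abs(W[i][nz]).mean() if nz.any() else 0.0
    return scales

# Forward pass uses alpha[i] * Q[i]; a straight-through estimator would pass
# gradients back to the full-precision weights W during training.
W = np.random.randn(4, 8) * 0.1
Q = ternarize_symmetric(W)
alpha = rowwise_scales(W, Q)
W_quantized = alpha[:, None] * Q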
Pages: 4300-4309
Page count: 10
Related Papers
50 records in total
  • [1] Bit Efficient Quantization for Deep Neural Networks
    Nayak, Prateeth
    Zhang, David
    Chai, Sek
    FIFTH WORKSHOP ON ENERGY EFFICIENT MACHINE LEARNING AND COGNITIVE COMPUTING - NEURIPS EDITION (EMC2-NIPS 2019), 2019, : 52 - 56
  • [2] Space Efficient Quantization for Deep Convolutional Neural Networks
    Zhao, Dong-Di
    Li, Fan
    Sharif, Kashif
    Xia, Guang-Min
    Wang, Yu
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2019, 34 (02) : 305 - 317
  • [3] Space Efficient Quantization for Deep Convolutional Neural Networks
    Dong-Di Zhao
    Fan Li
    Kashif Sharif
    Guang-Min Xia
    Yu Wang
    Journal of Computer Science and Technology, 2019, 34 : 305 - 317
  • [4] Learning accelerator of deep neural networks with logarithmic quantization
    Ueki, Takeo
    Iwai, Keisuke
    Matsubara, Takashi
    Kurokawa, Takakazu
    2018 7TH INTERNATIONAL CONGRESS ON ADVANCED APPLIED INFORMATICS (IIAI-AAI 2018), 2018, : 634 - 638
  • [5] Optimized Quantization for Convolutional Deep Neural Networks in Federated Learning
    Kim, You Jun
    Hong, Choong Seon
    APNOMS 2020: 2020 21ST ASIA-PACIFIC NETWORK OPERATIONS AND MANAGEMENT SYMPOSIUM (APNOMS), 2020, : 150 - 154
  • [6] ReLeQ: A Reinforcement Learning Approach for Automatic Deep Quantization of Neural Networks
    Elthakeb, Ahmed T.
    Pilligundla, Prannoy
    Mireshghallah, Fatemehsadat
    Yazdanbakhsh, Amir
    Esmaeilzadeh, Hadi
    IEEE MICRO, 2020, 40 (05) : 37 - 44
  • [7] Robust Quantization of Deep Neural Networks
    Kim, Youngseok
    Lee, Junyeol
    Kim, Younghoon
    Seo, Jiwon
    PROCEEDINGS OF THE 29TH INTERNATIONAL CONFERENCE ON COMPILER CONSTRUCTION (CC '20), 2020, : 74 - 84
  • [8] Stable tensor neural networks for efficient deep learning
    Newman, Elizabeth
    Horesh, Lior
    Avron, Haim
    Kilmer, Misha E.
    FRONTIERS IN BIG DATA, 2024, 7
  • [9] Post-Training Quantization for Energy Efficient Realization of Deep Neural Networks
    Latotzke, Cecilia
    Balim, Batuhan
    Gemmeke, Tobias
    2022 21ST IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, ICMLA, 2022, : 1559 - 1566
  • [10] Stochastic Quantization for Learning Accurate Low-Bit Deep Neural Networks
    Yinpeng Dong
    Renkun Ni
    Jianguo Li
    Yurong Chen
    Hang Su
    Jun Zhu
    International Journal of Computer Vision, 2019, 127 : 1629 - 1642