SYQ: Learning Symmetric Quantization For Efficient Deep Neural Networks

Cited by: 68
Authors
Faraone, Julian [1 ]
Fraser, Nicholas [2 ]
Blott, Michaela [2 ]
Leong, Philip H. W. [1 ]
Affiliations
[1] Univ Sydney, Sydney, NSW, Australia
[2] Xilinx Res Labs, Dublin, Ireland
Keywords
DOI
10.1109/CVPR.2018.00452
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Inference for state-of-the-art deep neural networks is computationally expensive, making them difficult to deploy on constrained hardware environments. An efficient way to reduce this complexity is to quantize the weight parameters and/or activations during training by approximating their distributions with a limited-entry codebook. At very low precisions, such as binary or ternary weights with 1-8 bit activations, the information loss from quantization leads to significant accuracy degradation due to large gradient mismatches between the forward and backward functions. In this paper, we introduce a quantization method that reduces this loss by learning a symmetric codebook for particular weight subgroups. These subgroups are determined based on their locality in the weight matrix, such that the hardware simplicity of the low-precision representations is preserved. Empirically, we show that symmetric quantization can substantially improve accuracy for networks with extremely low-precision weights and activations. We also demonstrate that this representation adds minimal or no hardware overhead relative to more coarse-grained approaches. Source code is available at https://www.github.com/julianfaraone/SYQ.
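The core idea in the abstract, binarizing weights to a symmetric codebook whose scaling coefficient is learned per weight subgroup, can be illustrated with a minimal sketch. This is not the authors' implementation: the row-wise subgroup granularity, the function name, and the mean-absolute-value initialization of the scaling factor are illustrative assumptions (in training, each coefficient would be a learnable parameter updated by gradient descent).

```python
import numpy as np

def syq_forward_sketch(W, num_subgroups):
    """Illustrative forward pass of symmetric binary quantization.

    Each subgroup of rows is mapped to the symmetric codebook
    {-alpha_g, +alpha_g}, where alpha_g is a per-subgroup scaling
    coefficient (here initialized to the subgroup's mean |weight|,
    a common heuristic; SYQ learns it during training).
    """
    rows_per_group = W.shape[0] // num_subgroups
    Wq = np.empty_like(W, dtype=float)
    for g in range(num_subgroups):
        sl = slice(g * rows_per_group, (g + 1) * rows_per_group)
        block = W[sl]
        alpha = np.abs(block).mean()       # symmetric scaling factor
        Wq[sl] = alpha * np.sign(block)    # codebook {-alpha, 0, +alpha}
    return Wq

# Example: two row-wise subgroups, each with its own scaling factor.
W = np.array([[0.5, -0.25],
              [0.75, -1.0]])
print(syq_forward_sketch(W, num_subgroups=2))
```

Because the codebook is symmetric about zero, inference only needs one multiplier per subgroup plus sign flips, which is why finer-grained scaling preserves the hardware simplicity of binary/ternary representations.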
Pages: 4300-4309
Page count: 10
Related Papers
50 items in total
  • [31] Filter Pruning for Efficient Transfer Learning in Deep Convolutional Neural Networks
    Reinhold, Caique
    Roisenberg, Mauro
    ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, PT I, 2019, 11508 : 191 - 202
  • [32] Scaling Up Quantization-Aware Neural Architecture Search for Efficient Deep Learning on the Edge
    Lu, Yao
    Rodriguez, Hiram Rayo Torres
    Vogel, Sebastian
    van de Waterlaat, Nick
    Jancura, Pavol
    PROCEEDINGS 2023 IEEE/ACM INTERNATIONAL WORKSHOP ON COMPILERS, DEPLOYMENT, AND TOOLING FOR EDGE AI, CODAI 2023, 2023, : 1 - 5
  • [33] REINFORCEMENT LEARNING-BASED LAYER-WISE QUANTIZATION FOR LIGHTWEIGHT DEEP NEURAL NETWORKS
    Jung, Juri
    Kim, Jonghee
    Kim, Youngeun
    Kim, Changick
    2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020, : 3070 - 3074
  • [34] DYNAMICS OF LEARNING IN SYMMETRIC AND ASYMMETRIC NEURAL NETWORKS
    KOHRING, G
    NEURAL NETWORKS FROM MODELS TO APPLICATIONS, 1989, : 227 - 234
  • [35] Joint Pruning and Channel-Wise Mixed-Precision Quantization for Efficient Deep Neural Networks
    Motetti, Beatrice Alessandra
    Risso, Matteo
    Burrello, Alessio
    Macii, Enrico
    Poncino, Massimo
    Pagliari, Daniele Jahier
    IEEE TRANSACTIONS ON COMPUTERS, 2024, 73 (11) : 2619 - 2633
  • [36] Quantization of deep neural networks for accumulator-constrained processors
    de Bruin, Barry
    Zivkovic, Zoran
    Corporaal, Henk
    MICROPROCESSORS AND MICROSYSTEMS, 2020, 72
  • [37] MEMORIZATION CAPACITY OF DEEP NEURAL NETWORKS UNDER PARAMETER QUANTIZATION
    Boo, Yoonho
    Shin, Sungho
    Sung, Wonyong
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 1383 - 1387
  • [38] A Deep Look into Logarithmic Quantization of Model Parameters in Neural Networks
    Cai, Jingyong
    Takemoto, Masashi
    Nakajo, Hironori
    PROCEEDINGS OF THE 10TH INTERNATIONAL CONFERENCE ON ADVANCES IN INFORMATION TECHNOLOGY (IAIT2018), 2018,
  • [39] Weighted-Entropy-based Quantization for Deep Neural Networks
    Park, Eunhyeok
    Ahn, Junwhan
    Yoo, Sungjoo
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 7197 - 7205
  • [40] ALPS: Adaptive Quantization of Deep Neural Networks with GeneraLized PositS
    Langroudi, Hamed F.
    Karia, Vedant
    Carmichael, Zachariah
    Zyarah, Abdullah
    Pandit, Tej
    Gustafson, John L.
    Kudithipudi, Dhireesha
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 3094 - 3103