Overcoming Oscillations in Quantization-Aware Training

Authors
Nagel, Markus [1]
Fournarakis, Marios [1]
Bondarenko, Yelysei [1]
Blankevoort, Tijmen [1]
Affiliations
[1] Qualcomm AI Res, San Diego, CA 92121 USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
When training neural networks with simulated quantization, we observe that quantized weights can, rather unexpectedly, oscillate between two grid points. The importance of this effect and its impact on quantization-aware training (QAT) are not well understood or investigated in the literature. In this paper, we delve deeper into the phenomenon of weight oscillations and show that it can lead to significant accuracy degradation due to wrongly estimated batch-normalization statistics during inference and increased noise during training. These effects are particularly pronounced in low-bit (≤ 4 bits) quantization of efficient networks with depth-wise separable layers, such as MobileNets and EfficientNets. In our analysis we investigate several previously proposed QAT algorithms and show that most of them are unable to overcome oscillations. Finally, we propose two novel QAT algorithms to overcome oscillations during training: oscillation dampening and iterative weight freezing. We demonstrate that our algorithms achieve state-of-the-art accuracy for low-bit (3- and 4-bit) weight and activation quantization of efficient architectures, such as MobileNetV2, MobileNetV3, and EfficientNet-lite, on ImageNet. Our source code is available at https://github.com/qualcomm-ai-research/oscillations-qat.
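The oscillation described in the abstract can be illustrated with a one-weight toy problem. The sketch below is not the authors' code and does not implement their dampening or freezing algorithms; it only assumes a uniform quantization grid with step 1, a squared-error loss toward a target that lies between two grid points, and the straight-through estimator (the rounding step is treated as the identity in the backward pass). The names `fake_quantize`, `simulate`, `w_init`, `w_star`, and `lr` are hypothetical and chosen for illustration.

```python
import math


def fake_quantize(w, scale=1.0):
    """Round a latent weight to the nearest point on a uniform grid."""
    # floor(x + 0.5) avoids Python's round-half-to-even behaviour.
    return scale * math.floor(w / scale + 0.5)


def simulate(w_init=0.11, w_star=0.25, lr=0.1, steps=200):
    """Run SGD on 0.5 * (q(w) - w_star)**2 with a straight-through gradient.

    Returns the quantized weight observed at every step.
    """
    w = w_init
    trace = []
    for _ in range(steps):
        q = fake_quantize(w)
        trace.append(q)
        # Straight-through estimator (assumption): d q(w) / d w is taken as 1,
        # so the gradient w.r.t. the latent weight is simply q - w_star.
        grad = q - w_star
        w -= lr * grad
    return trace


trace = simulate()
# Count how often the quantized weight jumps between grid points.
flips = sum(a != b for a, b in zip(trace, trace[1:]))
# Fraction of the last 100 steps spent on the upper grid point (1.0).
upper_fraction = sum(trace[-100:]) / 100

print("grid points visited:", sorted(set(trace)))
print("number of flips:", flips)
print("fraction of late steps at upper grid point:", upper_fraction)
```

In this toy setup the latent weight never settles: because the target sits between the grid points 0 and 1, the gradient always pushes the latent weight back across the rounding threshold at 0.5, so the quantized value keeps switching between the two points instead of converging.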
Pages: 13