LKBQ: PUSHING THE LIMIT OF POST-TRAINING QUANTIZATION TO EXTREME 1 BIT

Cited by: 1
Authors
Li, Tianxiang [1 ]
Chen, Bin [2 ,3 ]
Wang, Qian-Wei [1 ,3 ]
Huang, Yujun [1 ,3 ]
Xia, Shu-Tao [1 ,3 ]
Affiliations
[1] Tsinghua Univ, Tsinghua Shenzhen Int Grad Sch, Shenzhen, Peoples R China
[2] Harbin Inst Technol, Shenzhen, Peoples R China
[3] Peng Cheng Lab, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
post-training quantization; self-knowledge distillation; binary weight network;
DOI
10.1109/ICIP49359.2023.10222555
CLC number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405
Abstract
Recent advances have shown the potential of post-training quantization (PTQ) to quantize deep models to low bit-widths in a short time and with far fewer hardware resources than quantization-aware training (QAT). However, existing PTQ approaches suffer severe accuracy loss when quantizing models to extremely low bit-widths, e.g., 1 bit. In this work, we propose layer-by-layer self-knowledge-distillation binary post-training quantization (LKBQ), the first method capable of quantizing the weights of neural networks to 1 bit in the PTQ setting. We show that careful use of layer-by-layer self-distillation within LKBQ provides a significant performance boost. Furthermore, our evaluation shows that the initialization of the quantized network's weights has a large impact on the results, and we propose three weight-initialization methods. Finally, in light of the characteristics of the binarized network, we propose a gradient-scaling method to further improve efficiency. Our experiments show that LKBQ pushes the limit of PTQ to extreme 1-bit quantization for the first time.
Pages: 1775-1779
Number of pages: 5
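
For illustration only: below is a minimal sketch, assuming a PyTorch nn.Linear layer and a small calibration batch, of the kind of layer-by-layer self-distillation for 1-bit weights that the abstract describes. The helper name binarize_layer, the per-channel scale initialization, and the optimizer settings are assumptions for illustration, not the paper's exact method (which additionally covers three weight-initialization schemes and gradient scaling).

# Hypothetical sketch of layer-wise binary PTQ with self-distillation;
# not the authors' implementation.
import copy
import torch
import torch.nn as nn

def binarize_layer(fp_layer: nn.Linear, calib_inputs: torch.Tensor,
                   steps: int = 200, lr: float = 1e-3) -> nn.Linear:
    """Quantize one layer's weights to 1 bit, then tune a per-channel scale
    so the binary layer mimics the full-precision layer's outputs on
    calibration data (layer-wise self-distillation)."""
    w = fp_layer.weight.data                      # shape: (out, in)
    sign = torch.sign(w)
    sign[sign == 0] = 1.0                         # map zeros to +1
    # One possible initialization: per-output-channel mean absolute weight.
    scale = w.abs().mean(dim=1, keepdim=True).clone().requires_grad_(True)

    with torch.no_grad():
        teacher_out = fp_layer(calib_inputs)      # full-precision targets

    opt = torch.optim.Adam([scale], lr=lr)
    bias = fp_layer.bias.data if fp_layer.bias is not None else None
    for _ in range(steps):
        # Binary weights are sign * per-channel scale.
        student_out = calib_inputs @ (sign * scale).t()
        if bias is not None:
            student_out = student_out + bias
        loss = nn.functional.mse_loss(student_out, teacher_out)
        opt.zero_grad()
        loss.backward()
        opt.step()

    q_layer = copy.deepcopy(fp_layer)
    q_layer.weight.data = sign * scale.detach()
    return q_layer

if __name__ == "__main__":
    torch.manual_seed(0)
    layer = nn.Linear(64, 32)
    x = torch.randn(256, 64)                      # tiny calibration batch
    qlayer = binarize_layer(layer, x)
    err = (layer(x) - qlayer(x)).pow(2).mean()
    print(f"layer-wise reconstruction MSE after binarization: {err.item():.4f}")

In this sketch each layer is handled independently and the full-precision layer acts as its own teacher, which mirrors the layer-by-layer self-knowledge-distillation idea at a high level.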