LKBQ: PUSHING THE LIMIT OF POST-TRAINING QUANTIZATION TO EXTREME 1 BIT

Cited by: 1
Authors
Li, Tianxiang [1 ]
Chen, Bin [2 ,3 ]
Wang, Qian-Wei [1 ,3 ]
Huang, Yujun [1 ,3 ]
Xia, Shu-Tao [1 ,3 ]
Affiliations
[1] Tsinghua Univ, Tsinghua Shenzhen Int Grad Sch, Shenzhen, Peoples R China
[2] Harbin Inst Technol, Shenzhen, Peoples R China
[3] Peng Cheng Lab, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
post-training quantization; self-knowledge distillation; binary weight network;
DOI
10.1109/ICIP49359.2023.10222555
CLC number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405
Abstract
Recent advances have shown the potential of post-training quantization (PTQ) to quantize deep models to low bit-widths in a short time and with far fewer hardware resources than quantization-aware training (QAT). However, existing PTQ approaches suffer severe accuracy loss when quantizing models to extremely low bit-widths, e.g., 1 bit. In this work, we propose layer-by-layer self-knowledge-distillation binary post-training quantization (LKBQ), the first method capable of quantizing the weights of neural networks to 1 bit in the PTQ setting. We show that careful use of layer-by-layer self-distillation within LKBQ provides a significant performance boost. Furthermore, our evaluation shows that the initialization of the quantized network's weights has a large impact on the results, and we propose three weight-initialization methods. Finally, in light of the characteristics of the binarized network, we propose a gradient-scaling method to further improve efficiency. Our experiments show that LKBQ pushes the limit of PTQ to extreme 1-bit quantization for the first time.
Pages: 1775-1779
Number of pages: 5
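
For illustration only: below is a minimal sketch, assuming a PyTorch nn.Linear layer and a small calibration batch, of the kind of layer-by-layer self-distillation for 1-bit weights that the abstract describes. The helper name binarize_layer, the per-channel scale initialization, and the optimizer settings are assumptions for illustration, not the paper's exact method (which additionally covers three weight-initialization schemes and gradient scaling).

# Hypothetical sketch of layer-wise binary PTQ with self-distillation;
# not the authors' implementation.
import copy
import torch
import torch.nn as nn

def binarize_layer(fp_layer: nn.Linear, calib_inputs: torch.Tensor,
                   steps: int = 200, lr: float = 1e-3) -> nn.Linear:
    """Quantize one layer's weights to 1 bit, then tune a per-channel scale
    so the binary layer mimics the full-precision layer's outputs on
    calibration data (layer-wise self-distillation)."""
    w = fp_layer.weight.data                      # shape: (out, in)
    sign = torch.sign(w)
    sign[sign == 0] = 1.0                         # map zeros to +1
    # One possible initialization: per-output-channel mean absolute weight.
    scale = w.abs().mean(dim=1, keepdim=True).clone().requires_grad_(True)

    with torch.no_grad():
        teacher_out = fp_layer(calib_inputs)      # full-precision targets

    opt = torch.optim.Adam([scale], lr=lr)
    bias = fp_layer.bias.data if fp_layer.bias is not None else None
    for _ in range(steps):
        # Binary weights are sign * per-channel scale.
        student_out = calib_inputs @ (sign * scale).t()
        if bias is not None:
            student_out = student_out + bias
        loss = nn.functional.mse_loss(student_out, teacher_out)
        opt.zero_grad()
        loss.backward()
        opt.step()

    q_layer = copy.deepcopy(fp_layer)
    q_layer.weight.data = sign * scale.detach()
    return q_layer

if __name__ == "__main__":
    torch.manual_seed(0)
    layer = nn.Linear(64, 32)
    x = torch.randn(256, 64)                      # tiny calibration batch
    qlayer = binarize_layer(layer, x)
    err = (layer(x) - qlayer(x)).pow(2).mean()
    print(f"layer-wise reconstruction MSE after binarization: {err.item():.4f}")

In this sketch each layer is handled independently and the full-precision layer acts as its own teacher, which mirrors the layer-by-layer self-knowledge-distillation idea at a high level.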