Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective

Cited by: 6
Authors
Ma, Yuexiao [1]
Li, Huixia [2]
Zheng, Xiawu [3]
Xiao, Xuefeng [2]
Wang, Rui [2]
Wen, Shilei [2]
Pan, Xin [2]
Chao, Fei [1]
Ji, Rongrong [1,4]
Affiliations
[1] Xiamen Univ, Minist Educ China, Key Lab Multimedia Trusted Percept & Efficient Co, Sch Informat, Xiamen 361005, Peoples R China
[2] ByteDance Inc, Beijing, Peoples R China
[3] Peng Cheng Lab, Shenzhen, Peoples R China
[4] Xiamen Univ, Shenzhen Res Inst, Shenzhen, Peoples R China
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR | 2023
Funding
National Natural Science Foundation of China; National Key R&D Program of China
DOI
10.1109/CVPR52729.2023.00768
Chinese Library Classification
TP18 [Theory of artificial intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Post-training quantization (PTQ) is widely regarded as one of the most practically efficient compression methods, benefiting from its preservation of data privacy and low computation cost. We argue that the problem of oscillation in PTQ methods has been overlooked. In this paper, we take the initiative to explore this problem and present a theoretical proof of why it is essential in PTQ. We then address it by introducing a principled, theoretically grounded, and generalized framework. In particular, we first formulate oscillation in PTQ and prove that the problem is caused by differences in module capacity. To this end, we define the module capacity (ModCap) under both data-dependent and data-free scenarios, where the differentials between adjacent modules are used to measure the degree of oscillation. The problem is then solved by selecting the top-k differentials, whose corresponding modules are jointly optimized and quantized. Extensive experiments demonstrate that our method successfully reduces the performance drop and generalizes to different neural networks and PTQ methods. For example, with 2/4-bit ResNet-50 quantization, our method surpasses the previous state-of-the-art by 1.9%. The gain is more significant for small-model quantization, e.g., surpassing BRECQ by 6.61% on MobileNetV2 x0.5.
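The abstract's selection rule can be illustrated with a minimal sketch. This is not the authors' released code: it assumes each module already has a scalar ModCap estimate, and the function name select_joint_modules, its signature, and the toy numbers are all hypothetical. The sketch computes capacity differentials between adjacent modules and marks the modules belonging to the top-k pairs for joint optimization during PTQ reconstruction.

    from typing import Sequence, Set

    def select_joint_modules(capacities: Sequence[float], k: int) -> Set[int]:
        """Pick modules whose adjacent-capacity differential is among the top-k.

        Hypothetical sketch: large capacity differences between neighbouring
        modules indicate likely oscillation, so the corresponding module pairs
        are reconstructed (optimized and quantized) jointly.
        """
        # Differential between each pair of adjacent modules (pair i = modules i, i+1).
        diffs = [abs(capacities[i + 1] - capacities[i]) for i in range(len(capacities) - 1)]

        # Indices of the k largest differentials.
        top_pairs = sorted(range(len(diffs)), key=lambda i: diffs[i], reverse=True)[:k]

        # Collect both modules of every selected pair.
        joint = set()
        for i in top_pairs:
            joint.update({i, i + 1})
        return joint

    if __name__ == "__main__":
        # Toy ModCap values for a 6-module network (illustrative numbers only).
        caps = [1.0, 0.4, 0.9, 0.85, 0.2, 0.6]
        print(select_joint_modules(caps, k=2))  # {0, 1, 3, 4}

In the toy run, the two largest differentials occur between modules 3-4 and 0-1, so those four modules would be optimized and quantized jointly while the rest are handled module by module.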
Pages: 7950-7959 (10 pages)
Related Papers
50 records in total (entries [41]-[50] listed below)
  • [41] Hessian matrix-aware comprehensive post-training quantization for vision transformers
    Zhang, Weixing
    Tian, Zhuang
    Lin, Nan
    Yang, Cong
    Chen, Yongxia
    JOURNAL OF ELECTRONIC IMAGING, 2025, 34 (01)
  • [42] A novel framework for deployment of CNN models using post-training quantization on microcontroller
    Sailesh, M.
    Selvakumar, K.
    Prasanth, Narayanan
    MICROPROCESSORS AND MICROSYSTEMS, 2022, 94
  • [43] Towards Efficient Post-training Quantization of Pre-trained Language Models
    Bai, Haoli
    Hou, Lu
    Shang, Lifeng
    Jiang, Xin
    King, Irwin
    Lyu, Michael R.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [44] Bit-shrinking: Limiting Instantaneous Sharpness for Improving Post-training Quantization
    Lin, Chen
    Peng, Bo
    Li, Zheyang
    Tan, Wenming
    Ren, Ye
    Xiao, Jun
    Pu, Shiliang
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 16196 - 16205
  • [45] Towards Accurate Post-Training Quantization of Vision Transformers via Error Reduction
    Zhong, Yunshan
    Huang, You
    Hu, Jiawei
    Zhang, Yuxin
    Ji, Rongrong
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, 47 (04) : 2676 - 2692
  • [46] RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers
    Li, Zhikai
    Xiao, Junrui
    Yang, Lianwei
    Gu, Qingyi
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 17181 - 17190
  • [47] Rate-Distortion Optimized Post-Training Quantization for Learned Image Compression
    Shi, Junqi
    Lu, Ming
    Ma, Zhan
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (05) : 3082 - 3095
  • [48] NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers
    Liu, Yijiang
    Yang, Huanrui
    Dong, Zhen
    Keutzer, Kurt
    Du, Li
    Zhang, Shanghang
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 20321 - 20330
  • [49] FGPTQ-ViT: Fine-Grained Post-training Quantization for Vision Transformers
    Liu, Caihua
    Shi, Hongyang
    He, Xinyu
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT IX, 2024, 14433 : 79 - 90
  • [50] Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation
    Yao, Zhewei
    Wu, Xiaoxia
    Li, Cheng
    Youn, Stephen
    He, Yuxiong
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 17, 2024, : 19377 - 19385