Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective

Cited by: 6
Authors
Ma, Yuexiao [1 ]
Li, Huixia [2 ]
Zheng, Xiawu [3 ]
Xiao, Xuefeng [2 ]
Wang, Rui [2 ]
Wen, Shilei [2 ]
Pan, Xin [2 ]
Chao, Fei [1 ]
Ji, Rongrong [1 ,4 ]
Affiliations
[1] Xiamen Univ, Minist Educ China, Key Lab Multimedia Trusted Percept & Efficient Co, Sch Informat, Xiamen 361005, Peoples R China
[2] ByteDance Inc, Beijing, Peoples R China
[3] Peng Cheng Lab, Shenzhen, Peoples R China
[4] Xiamen Univ, Shenzhen Res Inst, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
DOI
10.1109/CVPR52729.2023.00768
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Post-training quantization (PTQ) is widely regarded as one of the most practical and efficient compression methods, benefiting from data privacy and low computation cost. We argue that the oscillation problem in PTQ methods has been overlooked. In this paper, we take the initiative to explore this problem and present a theoretical proof of why it is essential in PTQ. We then address it by introducing a principled and generalized theoretical framework. In particular, we first formulate oscillation in PTQ and prove that the problem is caused by the difference in module capacity. To this end, we define the module capacity (ModCap) under data-dependent and data-free scenarios, where the differentials between adjacent modules are used to measure the degree of oscillation. The problem is then solved by selecting the top-k differentials, whose corresponding modules are jointly optimized and quantized. Extensive experiments demonstrate that our method successfully reduces the performance drop and generalizes to different neural networks and PTQ methods. For example, with 2/4-bit ResNet-50 quantization, our method surpasses the previous state-of-the-art by 1.9%. The gain becomes more significant for small-model quantization, e.g., surpassing BRECQ by 6.61% on MobileNetV2 x0.5.
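The abstract outlines the core selection step: score each module's capacity (ModCap), treat the differentials between adjacent modules as an oscillation indicator, and jointly optimize the modules with the top-k differentials. The Python sketch below is a rough illustration of that selection step only; the function name select_joint_modules, the placeholder module_capacity callable, and the toy capacity scores are assumptions for illustration, not the paper's actual ModCap definition or implementation.

# Minimal sketch of the module-selection idea described in the abstract.
# The exact ModCap definition is given in the paper; here `module_capacity`
# is a placeholder (assumption) returning a scalar score per module.

from typing import Callable, List, Sequence, Tuple


def select_joint_modules(
    modules: Sequence[str],
    module_capacity: Callable[[str], float],
    k: int,
) -> List[Tuple[str, str]]:
    """Pick the k adjacent-module pairs with the largest capacity difference.

    Per the abstract, oscillation is driven by capacity differences between
    adjacent modules, and the modules with the top-k differentials are
    jointly optimized and quantized.
    """
    caps = [module_capacity(m) for m in modules]
    # Differential between each pair of adjacent modules.
    diffs = [abs(caps[i + 1] - caps[i]) for i in range(len(modules) - 1)]
    # Indices of the k largest differentials.
    top_idx = sorted(range(len(diffs)), key=lambda i: diffs[i], reverse=True)[:k]
    # Each selected pair (module_i, module_{i+1}) is then optimized jointly
    # rather than module-by-module, which is what suppresses the oscillation.
    return [(modules[i], modules[i + 1]) for i in sorted(top_idx)]


if __name__ == "__main__":
    # Toy example with made-up capacity scores (illustrative only).
    demo_caps = {"block1": 1.0, "block2": 3.5, "block3": 3.6, "block4": 0.8}
    pairs = select_joint_modules(list(demo_caps), demo_caps.get, k=2)
    print(pairs)  # [('block1', 'block2'), ('block3', 'block4')]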
Pages: 7950-7959
Page count: 10
Related Papers
50 items in total
  • [31] Enhancing Learning Transfer Through Post-Training Activities
    Scott, W. S.
    McGraw, L.
    Sauer, T.
    Belton, H.
    Bittner, S.
    Lynn, D.
    Sharrer, J.
    Speicher, T.
    TRANSFUSION, 2011, 51 : 270A - 271A
  • [32] Linear Domain-aware Log-scale Post-training Quantization
    Kim, Sungrae
    Kim, Hyun
    2021 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS-ASIA (ICCE-ASIA), 2021,
  • [33] Hybrid Post-Training Quantization for Super-Resolution Neural Network Compression
    Xu, Naijie
    Chen, Xiaohui
    Cao, Youlong
    Zhang, Wenyi
    IEEE SIGNAL PROCESSING LETTERS, 2023, 30 : 379 - 383
  • [34] Two Novel Non-Uniform Quantizers with Application in Post-Training Quantization
    Peric, Zoran
    Aleksic, Danijela
    Nikolic, Jelena
    Tomic, Stefan
    MATHEMATICS, 2022, 10 (19)
  • [35] Post-training Quantization with Multiple Points: Mixed Precision without Mixed Precision
Liu, Xingchao
    Ye, Mao
    Zhou, Dengyong
    Liu, Qiang
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 8697 - 8705
  • [36] Selective Focus: Investigating Semantics Sensitivity in Post-training Quantization for Lane Detection
    Fan, Yunqian
    Wei, Xiuying
    Gong, Ruihao
    Ma, Yuqing
    Zhang, Xiangguo
    Zhang, Qi
    Liu, Xianglong
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 11, 2024, : 11936 - 11943
  • [37] Optimization-Based Post-Training Quantization With Bit-Split and Stitching
    Wang, Peisong
    Chen, Weihan
    He, Xiangyu
    Chen, Qiang
    Liu, Qingshan
    Cheng, Jian
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (02) : 2119 - 2135
  • [38] ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
    Yao, Zhewei
    Aminabadi, Reza Yazdani
    Zhang, Minjia
    Wu, Xiaoxia
    Li, Conglong
    He, Yuxiong
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [39] PD-Quant: Post-Training Quantization Based on Prediction Difference Metric
    Liu, Jiawei
    Niu, Lin
    Yuan, Zhihang
    Yang, Dawei
    Wang, Xinggang
    Liu, Wenyu
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 24427 - 24437
  • [40] Z-FOLD: A Frustratingly Easy Post-Training Quantization Scheme for LLMs
    Jeon, Yongkweon
    Lee, Chungman
    Park, Kyungphil
    Kim, Ho-young
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 14446 - 14461