Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective

Cited by: 6
Authors
Ma, Yuexiao [1]
Li, Huixia [2]
Zheng, Xiawu [3]
Xiao, Xuefeng [2]
Wang, Rui [2]
Wen, Shilei [2]
Pan, Xin [2]
Chao, Fei [1]
Ji, Rongrong [1,4]
Affiliations
[1] Xiamen Univ, Minist Educ China, Key Lab Multimedia Trusted Percept & Efficient Co, Sch Informat, Xiamen 361005, Peoples R China
[2] ByteDance Inc, Beijing, Peoples R China
[3] Peng Cheng Lab, Shenzhen, Peoples R China
[4] Xiamen Univ, Shenzhen Res Inst, Shenzhen, Peoples R China
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR | 2023
Funding
National Natural Science Foundation of China; National Key R&D Program of China
DOI
10.1109/CVPR52729.2023.00768
Chinese Library Classification
TP18 [Theory of artificial intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Post-training quantization (PTQ) is widely regarded as one of the most practically efficient compression methods, benefiting from its preservation of data privacy and low computation cost. We argue that the problem of oscillation in PTQ methods has been overlooked. In this paper, we take the initiative to explore this problem and present a theoretical proof of why it is essential in PTQ. We then address it by introducing a principled, theoretically grounded, and generalized framework. In particular, we first formulate oscillation in PTQ and prove that the problem is caused by differences in module capacity. To this end, we define the module capacity (ModCap) under both data-dependent and data-free scenarios, where the differentials between adjacent modules are used to measure the degree of oscillation. The problem is then solved by selecting the top-k differentials, whose corresponding modules are jointly optimized and quantized. Extensive experiments demonstrate that our method successfully reduces the performance drop and generalizes to different neural networks and PTQ methods. For example, with 2/4-bit ResNet-50 quantization, our method surpasses the previous state-of-the-art by 1.9%. The gain is more significant for small-model quantization, e.g., surpassing BRECQ by 6.61% on MobileNetV2 x0.5.
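The abstract's selection rule can be illustrated with a minimal sketch. This is not the authors' released code: it assumes each module already has a scalar ModCap estimate, and the function name select_joint_modules, its signature, and the toy numbers are all hypothetical. The sketch computes capacity differentials between adjacent modules and marks the modules belonging to the top-k pairs for joint optimization during PTQ reconstruction.

    from typing import Sequence, Set

    def select_joint_modules(capacities: Sequence[float], k: int) -> Set[int]:
        """Pick modules whose adjacent-capacity differential is among the top-k.

        Hypothetical sketch: large capacity differences between neighbouring
        modules indicate likely oscillation, so the corresponding module pairs
        are reconstructed (optimized and quantized) jointly.
        """
        # Differential between each pair of adjacent modules (pair i = modules i, i+1).
        diffs = [abs(capacities[i + 1] - capacities[i]) for i in range(len(capacities) - 1)]

        # Indices of the k largest differentials.
        top_pairs = sorted(range(len(diffs)), key=lambda i: diffs[i], reverse=True)[:k]

        # Collect both modules of every selected pair.
        joint = set()
        for i in top_pairs:
            joint.update({i, i + 1})
        return joint

    if __name__ == "__main__":
        # Toy ModCap values for a 6-module network (illustrative numbers only).
        caps = [1.0, 0.4, 0.9, 0.85, 0.2, 0.6]
        print(select_joint_modules(caps, k=2))  # {0, 1, 3, 4}

In the toy run, the two largest differentials occur between modules 3-4 and 0-1, so those four modules would be optimized and quantized jointly while the rest are handled module by module.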
Pages: 7950-7959 (10 pages)
Related Papers
50 records in total (entries [41]-[50] listed below)
  • [41] Hessian matrix-aware comprehensive post-training quantization for vision transformers
    Zhang, Weixing
    Tian, Zhuang
    Lin, Nan
    Yang, Cong
    Chen, Yongxia
    JOURNAL OF ELECTRONIC IMAGING, 2025, 34 (01)
  • [42] A novel framework for deployment of CNN models using post-training quantization on microcontroller
    Sailesh, M.
    Selvakumar, K.
    Prasanth, Narayanan
    MICROPROCESSORS AND MICROSYSTEMS, 2022, 94
  • [43] Towards Efficient Post-training Quantization of Pre-trained Language Models
    Bai, Haoli
    Hou, Lu
    Shang, Lifeng
    Jiang, Xin
    King, Irwin
    Lyu, Michael R.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [44] Bit-shrinking: Limiting Instantaneous Sharpness for Improving Post-training Quantization
    Lin, Chen
    Peng, Bo
    Li, Zheyang
    Tan, Wenming
    Ren, Ye
    Xiao, Jun
    Pu, Shiliang
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 16196 - 16205
  • [45] Towards Accurate Post-Training Quantization of Vision Transformers via Error Reduction
    Zhong, Yunshan
    Huang, You
    Hu, Jiawei
    Zhang, Yuxin
    Ji, Rongrong
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, 47 (04) : 2676 - 2692
  • [46] RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers
    Li, Zhikai
    Xiao, Junrui
    Yang, Lianwei
    Gu, Qingyi
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 17181 - 17190
  • [47] Rate-Distortion Optimized Post-Training Quantization for Learned Image Compression
    Shi, Junqi
    Lu, Ming
    Ma, Zhan
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (05) : 3082 - 3095
  • [48] NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers
    Liu, Yijiang
    Yang, Huanrui
    Dong, Zhen
    Keutzer, Kurt
    Du, Li
    Zhang, Shanghang
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 20321 - 20330
  • [49] FGPTQ-ViT: Fine-Grained Post-training Quantization for Vision Transformers
    Liu, Caihua
    Shi, Hongyang
    He, Xinyu
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT IX, 2024, 14433 : 79 - 90
  • [50] Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation
    Yao, Zhewei
    Wu, Xiaoxia
    Li, Cheng
    Youn, Stephen
    He, Yuxiong
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 17, 2024, : 19377 - 19385