Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective

Cited by: 6
Authors
Ma, Yuexiao [1 ]
Li, Huixia [2 ]
Zheng, Xiawu [3 ]
Xiao, Xuefeng [2 ]
Wang, Rui [2 ]
Wen, Shilei [2 ]
Pan, Xin [2 ]
Chao, Fei [1 ]
Ji, Rongrong [1 ,4 ]
Affiliations
[1] Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, School of Informatics, Xiamen University, Xiamen 361005, China
[2] ByteDance Inc., Beijing, China
[3] Peng Cheng Laboratory, Shenzhen, China
[4] Shenzhen Research Institute, Xiamen University, Shenzhen, China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
DOI
10.1109/CVPR52729.2023.00768
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Post-training quantization (PTQ) is widely regarded as one of the most practical and efficient compression methods, benefiting from data privacy and low computational cost. We argue that oscillation is an overlooked problem in existing PTQ methods. In this paper, we take the initiative to explore this problem and present a theoretical proof of why it is essential in PTQ. We then address it by introducing a principled and generalized framework. In particular, we first formulate oscillation in PTQ and prove that the problem is caused by differences in module capacity. To this end, we define module capacity (ModCap) under both data-dependent and data-free scenarios, where the differentials between adjacent modules are used to measure the degree of oscillation. The problem is then solved by selecting the top-k differentials, whose corresponding modules are jointly optimized and quantized. Extensive experiments demonstrate that our method successfully reduces the performance drop and generalizes to different neural networks and PTQ methods. For example, with 2/4-bit ResNet-50 quantization, our method surpasses the previous state-of-the-art method by 1.9%. The gain is more significant for small models, e.g., surpassing the BRECQ method by 6.61% on MobileNetV2 x0.5.
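For intuition, the selection step summarized in the abstract admits a compact sketch: estimate a capacity value per module, take the differential between each pair of adjacent modules as an oscillation measure, and jointly optimize the module pairs with the top-k differentials. The Python sketch below is illustrative only; the function and variable names are hypothetical, and the actual ModCap definitions (data-dependent and data-free) are given in the paper.

    # Minimal sketch of the top-k differential selection described in the
    # abstract. All names here are hypothetical; the real ModCap metric is
    # defined in the paper under data-dependent and data-free scenarios.
    def select_topk_joint_modules(module_capacities, k):
        """Return indices i such that modules i and i+1 would be jointly
        optimized and quantized, chosen by the k largest adjacent
        capacity differentials."""
        diffs = [abs(b - a)
                 for a, b in zip(module_capacities, module_capacities[1:])]
        topk = sorted(range(len(diffs)), key=lambda i: diffs[i], reverse=True)[:k]
        return sorted(topk)

    # Example with made-up capacities for six consecutive modules: the two
    # sharpest capacity jumps sit between modules (1, 2) and (3, 4), so
    # those pairs would be merged for joint optimization.
    caps = [3.1, 3.0, 5.6, 5.5, 2.2, 2.3]
    print(select_topk_joint_modules(caps, k=2))  # -> [1, 3]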
Pages: 7950-7959
Number of pages: 10
Related Papers
50 records (entries [11]-[20] shown)
  • [11] Feng, Kai; Chen, Zhuo; Gao, Fei; Wang, Zhe; Xu, Long; Lin, Weisi. Post-Training Quantization for Vision Transformer in Transformed Domain. 2023 IEEE International Conference on Multimedia and Expo (ICME), 2023: 1457-1462.
  • [12] Kluska, Piotr; Zieba, Maciej. Post-training Quantization Methods for Deep Learning Models. Intelligent Information and Database Systems (ACIIDS 2020), Pt I, 2020, 12033: 467-479.
  • [13] He, Yefei; Liu, Luping; Liu, Jing; Wu, Weijia; Zhou, Hong; Zhuang, Bohan. PTQD: Accurate Post-Training Quantization for Diffusion Models. Advances in Neural Information Processing Systems 36 (NeurIPS 2023), 2023.
  • [14] Kirtas, M.; Passalis, N.; Oikonomou, A.; Mourgias-Alexandris, G.; Moralis-Pegios, M.; Pleros, N.; Tefas, A. Normalized Post-training Quantization for Photonic Neural Networks. 2022 IEEE Symposium Series on Computational Intelligence (SSCI), 2022: 657-663.
  • [15] Zhang, Jinjie; Zhou, Yixuan; Saab, Rayan. Post-training Quantization for Neural Networks with Provable Guarantees. SIAM Journal on Mathematics of Data Science, 2023, 5(2): 373-399.
  • [16] Oh, Sangyun; Sim, Hyeonuk; Kim, Jounghyun; Lee, Jongeun. Non-uniform Step Size Quantization for Accurate Post-training Quantization. Computer Vision, ECCV 2022, Pt XI, 2022, 13671: 658-673.
  • [17] Hao, Zhenyang; Wang, Xinggang; Liu, Jiawei; Yuan, Zhihang; Yang, Dawei; Liu, Wenyu. Stabilized Activation Scale Estimation for Precise Post-Training Quantization. Neurocomputing, 2024, 569.
  • [18] Meng, Jian; Li, Yuecheng; Li, Chenghui; Sarwar, Syed Shakib; Wang, Dilin; Seo, Jae-sun. POCA: Post-training Quantization with Temporal Alignment for Codec Avatars. Computer Vision, ECCV 2024, Pt XL, 2025, 15098: 230-246.
  • [19] Tu, Zhijun; Hu, Jie; Chen, Hanting; Wang, Yunhe. Toward Accurate Post-Training Quantization for Image Super Resolution. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023: 5856-5865.
  • [20] Wang, Yun; Liu, Qiang. AQA: An Adaptive Post-Training Quantization Method for Activations of CNNs. IEEE Transactions on Computers, 2024, 73(8): 2025-2035.