RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers

Cited by: 16
Authors
Li, Zhikai [1 ,2 ]
Xiao, Junrui [1 ,2 ]
Yang, Lianwei [1 ,2 ]
Gu, Qingyi [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
10.1109/ICCV51070.2023.01580
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Post-training quantization (PTQ), which only requires a tiny dataset for calibration without end-to-end retraining, is a light and practical model compression technique. Recently, several PTQ schemes for vision transformers (ViTs) have been presented; unfortunately, they typically suffer from non-trivial accuracy degradation, especially in low-bit cases. In this paper, we propose RepQ-ViT, a novel PTQ framework for ViTs based on quantization scale reparameterization, to address the above issues. RepQ-ViT decouples the quantization and inference processes, where the former employs complex quantizers and the latter employs scale-reparameterized simplified quantizers. This ensures both accurate quantization and efficient inference, which distinguishes it from existing approaches that sacrifice quantization performance to meet the target hardware. More specifically, we focus on two components with extreme distributions: post-LayerNorm activations with severe inter-channel variation and post-Softmax activations with power-law features, and initially apply channel-wise quantization and log√2 quantization, respectively. Then, we reparameterize the scales to hardware-friendly layer-wise quantization and log2 quantization for inference, with only slight accuracy or computational costs. Extensive experiments are conducted on multiple vision tasks with different model variants, proving that RepQ-ViT, without hyperparameters and expensive reconstruction procedures, can outperform existing strong baselines and encouragingly improve the accuracy of 4-bit PTQ of ViTs to a usable level. Code is available at https://github.com/zkkli/RepQ-ViT.
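The channel-wise-to-layer-wise reparameterization described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the toy activations, variable names, and the choice of the mean channel scale as the layer-wise scale are all assumptions for demonstration. It shows that quantizing suitably rescaled activations with a single layer-wise quantizer reproduces the channel-wise quantization result exactly, which is the property that lets the per-channel factors be folded into the preceding LayerNorm's affine parameters and the next layer's weights.

```python
import numpy as np

np.random.seed(0)

# Toy post-LayerNorm activations (8 tokens x 4 channels) with the kind of
# severe inter-channel variation the abstract describes (hypothetical data).
x = np.random.randn(8, 4) * np.array([0.1, 1.0, 5.0, 20.0])

n_bits = 4
qmax = 2**n_bits - 1  # 15 for 4-bit asymmetric uniform quantization

def fake_quant(x, scale, zero_point):
    """Quantize then dequantize with the given scale/zero-point."""
    q = np.clip(np.round(x / scale) + zero_point, 0, qmax)
    return (q - zero_point) * scale

# Calibration: accurate but hardware-unfriendly channel-wise parameters.
s_ch = (x.max(axis=0) - x.min(axis=0)) / qmax
z_ch = np.round(-x.min(axis=0) / s_ch)

# Reparameterization: a single hardware-friendly layer-wise quantizer.
s_lay = s_ch.mean()
z_lay = np.round(z_ch.mean())
r1 = s_ch / s_lay  # per-channel variation factors

# Folding r1 and the zero-point shift into the activations (in the paper,
# into the preceding LayerNorm's affine parameters) gives x_tilde, which
# the layer-wise quantizer maps to the same integer codes as the
# channel-wise quantizer maps x to:
x_tilde = x / r1 + (z_ch - z_lay) * s_lay
q = np.clip(np.round(x_tilde / s_lay) + z_lay, 0, qmax)

# The next layer's weights absorb r1 back, so dequantizing q with the
# original channel-wise parameters recovers the channel-wise result.
x_hat_rep = (q - z_ch) * s_ch
x_hat_ch = fake_quant(x, s_ch, z_ch)
assert np.allclose(x_hat_ch, x_hat_rep)
```

The equality holds because x_tilde / s_lay equals x / s_ch plus the integer offset (z_ch - z_lay), so rounding and clipping yield identical integer codes; only the affine parameters and weights change, not the quantized values.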
Pages: 17181 - 17190
Page count: 10
Related Papers
50 records
  • [1] PTQ4ViT: Post-training Quantization for Vision Transformers with Twin Uniform Quantization
    Yuan, Zhihang
    Xue, Chenhao
    Chen, Yiqi
    Wu, Qiang
    Sun, Guangyu
    COMPUTER VISION, ECCV 2022, PT XII, 2022, 13672 : 191 - 207
  • [2] FGPTQ-ViT: Fine-Grained Post-training Quantization for Vision Transformers
    Liu, Caihua
    Shi, Hongyang
    He, Xinyu
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT IX, 2024, 14433 : 79 - 90
  • [3] ADFQ-ViT: Activation-Distribution-Friendly post-training Quantization for Vision Transformers
    Jiang, Yanfeng
    Sun, Ning
    Xie, Xueshuo
    Yang, Fei
    Li, Tao
    NEURAL NETWORKS, 2025, 186
  • [4] AGQB-ViT: Adaptive granularity quantizer with bias for post-training quantization of Vision Transformers
    Huo, Ying
    Kang, Yongqiang
    Yang, Dawei
    Zhu, Jiahao
    NEUROCOMPUTING, 2025, 637
  • [5] AdaLog: Post-training Quantization for Vision Transformers with Adaptive Logarithm Quantizer
    Wu, Zhuguanyu
    Chen, Jiaxin
    Zhong, Hanwen
    Huang, Di
    Wang, Yunhong
    COMPUTER VISION - ECCV 2024, PT XXVII, 2025, 15085 : 411 - 427
  • [6] Post-Training Quantization for Vision Transformer
    Liu, Zhenhua
    Wang, Yunhe
    Han, Kai
    Zhang, Wei
    Ma, Siwei
    Gao, Wen
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [7] Hessian matrix-aware comprehensive post-training quantization for vision transformers
    Zhang, Weixing
    Tian, Zhuang
    Lin, Nan
    Yang, Cong
    Chen, Yongxia
    JOURNAL OF ELECTRONIC IMAGING, 2025, 34 (01)
  • [8] Towards Accurate Post-Training Quantization of Vision Transformers via Error Reduction
    Zhong, Yunshan
    Huang, You
    Hu, Jiawei
    Zhang, Yuxin
    Ji, Rongrong
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, 47 (04) : 2676 - 2692
  • [9] ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
    Yao, Zhewei
    Aminabadi, Reza Yazdani
    Zhang, Minjia
    Wu, Xiaoxia
    Li, Conglong
    He, Yuxiong
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [10] NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers
    Liu, Yijiang
    Yang, Huanrui
    Dong, Zhen
    Keutzer, Kurt
    Du, Li
    Zhang, Shanghang
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 20321 - 20330