RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers

Cited by: 16
Authors
Li, Zhikai [1 ,2 ]
Xiao, Junrui [1 ,2 ]
Yang, Lianwei [1 ,2 ]
Gu, Qingyi [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
10.1109/ICCV51070.2023.01580
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Post-training quantization (PTQ), which only requires a tiny dataset for calibration without end-to-end retraining, is a light and practical model compression technique. Recently, several PTQ schemes for vision transformers (ViTs) have been presented; unfortunately, they typically suffer from non-trivial accuracy degradation, especially in low-bit cases. In this paper, we propose RepQ-ViT, a novel PTQ framework for ViTs based on quantization scale reparameterization, to address the above issues. RepQ-ViT decouples the quantization and inference processes, where the former employs complex quantizers and the latter employs scale-reparameterized simplified quantizers. This ensures both accurate quantization and efficient inference, which distinguishes it from existing approaches that sacrifice quantization performance to meet the target hardware. More specifically, we focus on two components with extreme distributions: post-LayerNorm activations with severe inter-channel variation and post-Softmax activations with power-law features, and initially apply channel-wise quantization and log√2 quantization, respectively. Then, we reparameterize the scales to hardware-friendly layer-wise quantization and log2 quantization for inference, with only slight accuracy or computational costs. Extensive experiments are conducted on multiple vision tasks with different model variants, proving that RepQ-ViT, without hyperparameters and expensive reconstruction procedures, can outperform existing strong baselines and encouragingly improve the accuracy of 4-bit PTQ of ViTs to a usable level. Code is available at https://github.com/zkkli/RepQ-ViT.
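The channel-wise-to-layer-wise reparameterization described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the toy activations, variable names, and the choice of the mean channel scale as the layer-wise scale are all assumptions for demonstration. It shows that quantizing suitably rescaled activations with a single layer-wise quantizer reproduces the channel-wise quantization result exactly, which is the property that lets the per-channel factors be folded into the preceding LayerNorm's affine parameters and the next layer's weights.

```python
import numpy as np

np.random.seed(0)

# Toy post-LayerNorm activations (8 tokens x 4 channels) with the kind of
# severe inter-channel variation the abstract describes (hypothetical data).
x = np.random.randn(8, 4) * np.array([0.1, 1.0, 5.0, 20.0])

n_bits = 4
qmax = 2**n_bits - 1  # 15 for 4-bit asymmetric uniform quantization

def fake_quant(x, scale, zero_point):
    """Quantize then dequantize with the given scale/zero-point."""
    q = np.clip(np.round(x / scale) + zero_point, 0, qmax)
    return (q - zero_point) * scale

# Calibration: accurate but hardware-unfriendly channel-wise parameters.
s_ch = (x.max(axis=0) - x.min(axis=0)) / qmax
z_ch = np.round(-x.min(axis=0) / s_ch)

# Reparameterization: a single hardware-friendly layer-wise quantizer.
s_lay = s_ch.mean()
z_lay = np.round(z_ch.mean())
r1 = s_ch / s_lay  # per-channel variation factors

# Folding r1 and the zero-point shift into the activations (in the paper,
# into the preceding LayerNorm's affine parameters) gives x_tilde, which
# the layer-wise quantizer maps to the same integer codes as the
# channel-wise quantizer maps x to:
x_tilde = x / r1 + (z_ch - z_lay) * s_lay
q = np.clip(np.round(x_tilde / s_lay) + z_lay, 0, qmax)

# The next layer's weights absorb r1 back, so dequantizing q with the
# original channel-wise parameters recovers the channel-wise result.
x_hat_rep = (q - z_ch) * s_ch
x_hat_ch = fake_quant(x, s_ch, z_ch)
assert np.allclose(x_hat_ch, x_hat_rep)
```

The equality holds because x_tilde / s_lay equals x / s_ch plus the integer offset (z_ch - z_lay), so rounding and clipping yield identical integer codes; only the affine parameters and weights change, not the quantized values.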
Pages: 17181 - 17190
Page count: 10
Related Papers
50 records
  • [1] PTQ4ViT: Post-training Quantization for Vision Transformers with Twin Uniform Quantization
    Yuan, Zhihang
    Xue, Chenhao
    Chen, Yiqi
    Wu, Qiang
    Sun, Guangyu
    COMPUTER VISION, ECCV 2022, PT XII, 2022, 13672 : 191 - 207
  • [2] FGPTQ-ViT: Fine-Grained Post-training Quantization for Vision Transformers
    Liu, Caihua
    Shi, Hongyang
    He, Xinyu
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT IX, 2024, 14433 : 79 - 90
  • [3] ADFQ-ViT: Activation-Distribution-Friendly post-training Quantization for Vision Transformers
    Jiang, Yanfeng
    Sun, Ning
    Xie, Xueshuo
    Yang, Fei
    Li, Tao
    NEURAL NETWORKS, 2025, 186
  • [4] AGQB-ViT: Adaptive granularity quantizer with bias for post-training quantization of Vision Transformers
    Huo, Ying
    Kang, Yongqiang
    Yang, Dawei
    Zhu, Jiahao
    NEUROCOMPUTING, 2025, 637
  • [5] AdaLog: Post-training Quantization for Vision Transformers with Adaptive Logarithm Quantizer
    Wu, Zhuguanyu
    Chen, Jiaxin
    Zhong, Hanwen
    Huang, Di
    Wang, Yunhong
    COMPUTER VISION - ECCV 2024, PT XXVII, 2025, 15085 : 411 - 427
  • [6] Post-Training Quantization for Vision Transformer
    Liu, Zhenhua
    Wang, Yunhe
    Han, Kai
    Zhang, Wei
    Ma, Siwei
    Gao, Wen
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [7] Hessian matrix-aware comprehensive post-training quantization for vision transformers
    Zhang, Weixing
    Tian, Zhuang
    Lin, Nan
    Yang, Cong
    Chen, Yongxia
    JOURNAL OF ELECTRONIC IMAGING, 2025, 34 (01)
  • [8] Towards Accurate Post-Training Quantization of Vision Transformers via Error Reduction
    Zhong, Yunshan
    Huang, You
    Hu, Jiawei
    Zhang, Yuxin
    Ji, Rongrong
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, 47 (04) : 2676 - 2692
  • [9] ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
    Yao, Zhewei
    Aminabadi, Reza Yazdani
    Zhang, Minjia
    Wu, Xiaoxia
    Li, Conglong
    He, Yuxiong
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [10] NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers
    Liu, Yijiang
    Yang, Huanrui
    Dong, Zhen
    Keutzer, Kurt
    Du, Li
    Zhang, Shanghang
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 20321 - 20330