RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers

Cited by: 16
Authors
Li, Zhikai [1 ,2 ]
Xiao, Junrui [1 ,2 ]
Yang, Lianwei [1 ,2 ]
Gu, Qingyi [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
Source
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023
Funding
National Natural Science Foundation of China
DOI
10.1109/ICCV51070.2023.01580
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Post-training quantization (PTQ), which only requires a tiny dataset for calibration without end-to-end retraining, is a light and practical model compression technique. Recently, several PTQ schemes for vision transformers (ViTs) have been presented; unfortunately, they typically suffer from non-trivial accuracy degradation, especially in low-bit cases. In this paper, we propose RepQ-ViT, a novel PTQ framework for ViTs based on quantization scale reparameterization, to address the above issues. RepQ-ViT decouples the quantization and inference processes, where the former employs complex quantizers and the latter employs scale-reparameterized simplified quantizers. This ensures both accurate quantization and efficient inference, which distinguishes it from existing approaches that sacrifice quantization performance to meet the target hardware. More specifically, we focus on two components with extreme distributions: post-LayerNorm activations with severe inter-channel variation and post-Softmax activations with power-law features, and initially apply channel-wise quantization and log√2 quantization, respectively. Then, we reparameterize the scales to hardware-friendly layer-wise quantization and log2 quantization for inference, with only slight accuracy or computational costs. Extensive experiments are conducted on multiple vision tasks with different model variants, proving that RepQ-ViT, without hyperparameters and expensive reconstruction procedures, can outperform existing strong baselines and encouragingly improve the accuracy of 4-bit PTQ of ViTs to a usable level. Code is available at https://github.com/zkkli/RepQ-ViT.
Pages: 17181-17190
Page count: 10
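
To make the scale-reparameterization idea in the abstract concrete, below is a minimal NumPy sketch of folding channel-wise quantization parameters of a post-LayerNorm activation into a single layer-wise pair. The function name, the choice of the shared scale and zero point as channel means, and the sign conventions are illustrative assumptions, not the authors' exact formulation; the reference implementation lives in the linked repository.

```python
import numpy as np

def reparam_channelwise_to_layerwise(s, z, gamma, beta, W_next, b_next):
    """Fold channel-wise quantization params (s, z) of a post-LayerNorm
    activation into one layer-wise pair (s_t, z_t), compensating in the
    LayerNorm affine factors and the next linear layer.

    s, z        : per-channel scale / zero point, shape (C,)
    gamma, beta : LayerNorm affine parameters, shape (C,)
    W_next      : next linear layer's weight, shape (out, C)
    b_next      : next linear layer's bias, shape (out,)
    """
    # Shared layer-wise parameters; the channel mean is an assumed choice.
    s_t = s.mean()
    z_t = z.mean()

    # Per-channel variation factors between the two parameterizations.
    r1 = s / s_t        # scale ratios
    r2 = z - z_t        # zero-point offsets

    # Quantizing x_c / r1_c + r2_c * s_t with (s_t, z_t) reproduces the
    # integers of quantizing x_c with (s_c, z_c), up to rounding/clipping:
    #   (x/r1 + r2*s_t)/s_t + z_t = x/s + (z - z_t) + z_t = x/s + z
    # The per-channel affine transform is absorbed into LayerNorm's affine:
    gamma_new = gamma / r1
    beta_new = beta / r1 + r2 * s_t

    # ...and undone in the next linear layer so the FP output is unchanged:
    #   W @ x = (W * r1) @ (x/r1 + r2*s_t) - (W * r1) @ (r2*s_t)
    W_new = W_next * r1                  # scales column c of W by r1[c]
    b_new = b_next - W_new @ (r2 * s_t)

    return s_t, z_t, gamma_new, beta_new, W_new, b_new
```

Because the transform is absorbed into the LayerNorm affine factors and exactly undone in the next layer's weights and bias, the network's floating-point function is untouched; only the quantizer's rounding and clipping behavior changes, which is why the abstract describes the accuracy and computational cost of the reparameterization as slight.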