RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers

Cited by: 16
Authors
Li, Zhikai [1 ,2 ]
Xiao, Junrui [1 ,2 ]
Yang, Lianwei [1 ,2 ]
Gu, Qingyi [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
Source
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023) | 2023
Funding
National Natural Science Foundation of China;
Keywords
DOI
10.1109/ICCV51070.2023.01580
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Post-training quantization (PTQ), which only requires a tiny dataset for calibration without end-to-end retraining, is a light and practical model compression technique. Recently, several PTQ schemes for vision transformers (ViTs) have been presented; unfortunately, they typically suffer from non-trivial accuracy degradation, especially in low-bit cases. In this paper, we propose RepQ-ViT, a novel PTQ framework for ViTs based on quantization scale reparameterization, to address the above issues. RepQ-ViT decouples the quantization and inference processes, where the former employs complex quantizers and the latter employs scale-reparameterized simplified quantizers. This ensures both accurate quantization and efficient inference, which distinguishes it from existing approaches that sacrifice quantization performance to meet the target hardware. More specifically, we focus on two components with extreme distributions: post-LayerNorm activations with severe inter-channel variation and post-Softmax activations with power-law features, and initially apply channel-wise quantization and log√2 quantization, respectively. Then, we reparameterize the scales to hardware-friendly layer-wise quantization and log2 quantization for inference, with only slight accuracy or computational costs. Extensive experiments are conducted on multiple vision tasks with different model variants, proving that RepQ-ViT, without hyperparameters and expensive reconstruction procedures, can outperform existing strong baselines and encouragingly improve the accuracy of 4-bit PTQ of ViTs to a usable level. Code is available at https://github.com/zkkli/RepQ-ViT.
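To make the scale reparameterization concrete, the sketch below illustrates in PyTorch how channel-wise quantization of a post-LayerNorm activation can be converted to layer-wise quantization by folding per-channel factors into the LayerNorm affine parameters and compensating in the following linear layer. This is a minimal illustration reconstructed from the abstract's description, not the authors' implementation: the uniform asymmetric quantizer, the mean-based choice of the shared scale and zero point, and all tensor shapes and variable names are assumptions.

```python
# Hedged sketch (not the authors' code): check numerically that channel-wise
# quantization of a post-LayerNorm activation can be reparameterized into
# layer-wise quantization without changing the network output.
import torch

torch.manual_seed(0)
B, C, D = 8, 16, 32          # tokens, LayerNorm channels, next-layer output dim
n_bits = 4
qmax = 2 ** n_bits - 1

def fake_quant(x, scale, zero):
    """Uniform asymmetric fake quantization (quantize then dequantize)."""
    q = torch.clamp(torch.round(x / scale) + zero, 0, qmax)
    return (q - zero) * scale

# Post-LayerNorm activation gamma * x_norm + beta, with inter-channel variation.
x_norm = torch.randn(B, C)
gamma = torch.rand(C) * 2.0 + 0.1
beta = torch.randn(C)
act = x_norm * gamma + beta

# Channel-wise quantization parameters (accurate but hardware-unfriendly).
x_min, x_max = act.min(dim=0).values, act.max(dim=0).values
s = (x_max - x_min) / qmax                      # per-channel scales
z = torch.round(-x_min / s)                     # per-channel zero points

# Next linear layer.
W = torch.randn(D, C)
b = torch.randn(D)
ref = fake_quant(act, s, z) @ W.t() + b         # reference: channel-wise path

# --- Reparameterize to layer-wise quantization (assumed mean-based target) ---
s_tilde, z_tilde = s.mean(), torch.round(z.mean())
r1 = s / s_tilde                                # per-channel variation factor

# Fold the per-channel transform x -> x / r1 - (z_tilde - z) * s_tilde
# into the LayerNorm affine parameters ...
gamma_rep = gamma / r1
beta_rep = beta / r1 - (z_tilde - z) * s_tilde
# ... and compensate in the following linear layer so the output is unchanged.
W_rep = W * r1                                  # rescale each input channel
b_rep = b + W @ ((z_tilde - z) * s)             # absorb the zero-point shift

act_rep = x_norm * gamma_rep + beta_rep
out = fake_quant(act_rep, s_tilde, z_tilde) @ W_rep.t() + b_rep

print("max abs difference:", (out - ref).abs().max().item())  # ~0 (float error)
```

Under these assumptions the layer-wise path reproduces the channel-wise result up to floating-point error, which is the sense in which the inference-time quantizers can be "simplified" without re-calibration; the log√2-to-log2 reparameterization of post-Softmax activations follows the same decoupling idea but is not shown here.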
Pages: 17181 - 17190
Page count: 10
Related Papers
50 records in total
  • [21] CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-training Quantization of ViTs
    Ramachandran, Akshat
    Kundu, Souvik
    Krishna, Tushar
    COMPUTER VISION - ECCV 2024, PT LXVII, 2025, 15125 : 307 - 325
  • [22] A Fast Post-Training Pruning Framework for Transformers
    Kwon, Woosuk
    Kim, Sehoon
    Mahoney, Michael W.
    Hassoun, Joseph
    Keutzer, Kurt
    Gholami, Amir
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [23] Linear Domain-aware Log-scale Post-training Quantization
    Kim, Sungrae
    Kim, Hyun
    2021 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS-ASIA (ICCE-ASIA), 2021,
  • [24] Post-Training Sparsity-Aware Quantization
    Shomron, Gil
    Gabbay, Freddy
    Kurzum, Samer
    Weiser, Uri
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [25] Towards accurate post-training quantization for reparameterized models
    Zhang, Luoming
    He, Yefei
    Fei, Wen
    Lou, Zhenyu
    Wu, Weijia
    Ying, Yangwei
    Zhou, Hong
    APPLIED INTELLIGENCE, 2025, 55 (07)
  • [26] Improving the Post-Training Neural Network Quantization by Prepositive Feature Quantization
    Chu, Tianshu
    Yang, Zuopeng
    Huang, Xiaolin
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (04) : 3056 - 3060
  • [27] MSQuant: Efficient Post-Training Quantization for Object Detection via Migration Scale Search
    Jiang, Zhesheng
    Li, Chao
    Qu, Tao
    He, Chu
    Wang, Dingwen
    ELECTRONICS, 2025, 14 (03)
  • [28] Post-training Quantization of Deep Neural Network Weights
    Khayrov, E. M.
    Malsagov, M. Yu.
    Karandashev, I. M.
    ADVANCES IN NEURAL COMPUTATION, MACHINE LEARNING, AND COGNITIVE RESEARCH III, 2020, 856 : 230 - 238
  • [29] Post-training Quantization Methods for Deep Learning Models
    Kluska, Piotr
    Zieba, Maciej
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS (ACIIDS 2020), PT I, 2020, 12033 : 467 - 479
  • [30] PTQD: Accurate Post-Training Quantization for Diffusion Models
    He, Yefei
    Liu, Luping
    Liu, Jing
    Wu, Weijia
    Zhou, Hong
    Zhuang, Bohan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,