Hessian matrix-aware comprehensive post-training quantization for vision transformers

Cited by: 0
Authors
Zhang, Weixing [1]
Tian, Zhuang [1]
Lin, Nan [1]
Yang, Cong [1]
Chen, Yongxia [1]
Affiliations
[1] Zhengzhou Univ, Sch Cyber Sci & Engn, Zhengzhou, Peoples R China
Keywords
model quantization; post-training quantization; image classification; Hessian matrix; vision transformer
DOI
10.1117/1.JEI.34.1.013009
CLC Classification
TM [Electrical Technology]; TN [Electronic and Communication Technology]
Subject Classification Codes
0808; 0809
Abstract
In recent years, vision transformers (ViTs) have made significant breakthroughs in computer vision and have shown great potential in large-scale models. However, quantization methods designed for convolutional neural networks perform poorly on ViT models and cause a significant drop in accuracy. We extend the Hessian matrix-based quantization parameter optimization method and apply it to the quantization of the LayerNorm module in ViT models. This reduces the impact of quantizing LayerNorm on task accuracy and enables more comprehensive quantization of ViT models. To quantize ViT models quickly, we propose a quantization framework designed specifically for them: Hessian matrix-aware post-training quantization for vision transformers (HAPTQ). Experimental results on various models and datasets demonstrate that HAPTQ, after quantizing the LayerNorm module of various ViT models, achieves lossless quantization (an accuracy drop of less than 1%) on ImageNet classification tasks. In particular, HAPTQ reaches 85.81% top-1 accuracy with the ViT-L model.
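The abstract does not spell out the optimization, but the general Hessian-guided post-training quantization idea it builds on can be sketched: under a second-order Taylor expansion of the task loss, the damage caused by quantization noise on a layer output is weighted by the Hessian, and prior PTQ work commonly approximates the diagonal Hessian with squared gradients collected on a small calibration set. The Python sketch below illustrates such a Hessian-weighted scale search for a LayerNorm output under those assumptions; it is not the authors' exact HAPTQ algorithm, and the function names (`quantize`, `hessian_aware_scale_search`), the candidate range, and the toy data are all hypothetical.

```python
import numpy as np

def quantize(x, scale, n_bits=8):
    """Symmetric uniform quantizer: round to the integer grid,
    clip to the signed n-bit range, then de-quantize."""
    qmax = 2 ** (n_bits - 1) - 1
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale

def hessian_aware_scale_search(x, grad, n_bits=8, n_candidates=100):
    """Pick the quantization scale minimizing the Hessian-weighted
    error sum_i H_ii * (Q(x_i) - x_i)^2, with the diagonal Hessian
    H_ii approximated by the squared gradient of the task loss
    w.r.t. the (LayerNorm) output -- a common proxy in
    Hessian-guided PTQ, assumed here for illustration."""
    h_diag = grad ** 2                                # diagonal Hessian proxy
    qmax = 2 ** (n_bits - 1) - 1
    best_scale, best_err = None, np.inf
    # Shrink the clipping range over arbitrary candidates (0.3..1.0).
    for alpha in np.linspace(0.3, 1.0, n_candidates):
        scale = alpha * np.abs(x).max() / qmax
        err = np.sum(h_diag * (quantize(x, scale, n_bits) - x) ** 2)
        if err < best_err:
            best_scale, best_err = scale, err
    return best_scale

# Toy usage: calibrate a scale for a simulated LayerNorm output.
rng = np.random.default_rng(0)
x = rng.normal(size=(197, 768))          # hypothetical ViT-B token activations
grad = rng.normal(size=x.shape) * 0.01   # gradient from one calibration batch
print("chosen scale:", hessian_aware_scale_search(x, grad))
```

In a real pipeline the gradients would come from backpropagating the task loss through the full model on calibration data, and the chosen scale would then be frozen for inference.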
Pages: 14
Related Papers
50 items in total
  • [1] AdaLog: Post-training Quantization for Vision Transformers with Adaptive Logarithm Quantizer
    Wu, Zhuguanyu
    Chen, Jiaxin
    Zhong, Hanwen
    Huang, Di
    Wang, Yunhong
    COMPUTER VISION - ECCV 2024, PT XXVII, 2025, 15085 : 411 - 427
  • [2] Loss aware post-training quantization
    Nahshan, Yury
    Chmiel, Brian
    Baskin, Chaim
    Zheltonozhskii, Evgenii
    Banner, Ron
    Bronstein, Alex M.
    Mendelson, Avi
    MACHINE LEARNING, 2021, 110 (11-12) : 3245 - 3262
  • [3] Post-Training Quantization for Vision Transformer
    Liu, Zhenhua
    Wang, Yunhe
    Han, Kai
    Zhang, Wei
    Ma, Siwei
    Gao, Wen
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [4] PTQ4ViT: Post-training Quantization for Vision Transformers with Twin Uniform Quantization
    Yuan, Zhihang
    Xue, Chenhao
    Chen, Yiqi
    Wu, Qiang
    Sun, Guangyu
    COMPUTER VISION, ECCV 2022, PT XII, 2022, 13672 : 191 - 207
  • [5] Towards Accurate Post-Training Quantization of Vision Transformers via Error Reduction
    Zhong, Yunshan
    Huang, You
    Hu, Jiawei
    Zhang, Yuxin
    Ji, Rongrong
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, 47 (04) : 2676 - 2692
  • [6] RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers
    Li, Zhikai
    Xiao, Junrui
    Yang, Lianwei
    Gu, Qingyi
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 17181 - 17190
  • [7] Post-Training Sparsity-Aware Quantization
    Shomron, Gil
    Gabbay, Freddy
    Kurzum, Samer
    Weiser, Uri
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [8] NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers
    Liu, Yijiang
    Yang, Huanrui
    Dong, Zhen
    Keutzer, Kurt
    Du, Li
    Zhang, Shanghang
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 20321 - 20330
  • [9] FGPTQ-ViT: Fine-Grained Post-training Quantization for Vision Transformers
    Liu, Caihua
    Shi, Hongyang
    He, Xinyu
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT IX, 2024, 14433 : 79 - 90