Post-Training Quantization for Vision Transformer

Cited by: 0
Authors
Liu, Zhenhua [1 ,2 ]
Wang, Yunhe [2 ]
Han, Kai [2 ]
Zhang, Wei [2 ]
Ma, Siwei [1 ,3 ]
Gao, Wen [1 ,3 ]
Affiliations
[1] Peking Univ, Sch Elect Engn & Comp Sci, Beijing, Peoples R China
[2] Huawei Noah's Ark Lab, Montreal, PQ, Canada
[3] Peng Cheng Lab, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, transformers have achieved remarkable performance on a variety of computer vision applications. Compared with mainstream convolutional neural networks, vision transformers often rely on sophisticated architectures to extract powerful feature representations, which makes them harder to deploy on mobile devices. In this paper, we present an effective post-training quantization algorithm for reducing the memory footprint and computational cost of vision transformers. In essence, the quantization task can be regarded as finding the optimal low-bit quantization intervals for weights and inputs, respectively. To preserve the functionality of the attention mechanism, we introduce a ranking loss into the conventional quantization objective that aims to keep the relative order of the self-attention results after quantization. Moreover, we thoroughly analyze the relationship between the quantization loss of different layers and feature diversity, and explore a mixed-precision quantization scheme based on the nuclear norm of each attention map and output feature. The effectiveness of the proposed method is verified on several benchmark models and datasets, where it outperforms state-of-the-art post-training quantization algorithms. For instance, we obtain an 81.29% top-1 accuracy with the DeiT-B model on the ImageNet dataset using about 8-bit quantization. Code will be available at https://gitee.com/mindspore/models/tree/master/research/cv/VTPTQ.
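To make the abstract's two ideas concrete, the sketch below illustrates in PyTorch how a low-bit quantization interval could be grid-searched per tensor, how a pairwise ranking loss could penalize order changes in quantized self-attention scores, and how a nuclear-norm criterion could steer per-layer bit allocation. This is a minimal illustration written from the abstract alone, not the authors' released code; the function names, the search grid, and the nuclear-norm threshold are assumptions.

```python
# Minimal sketch of interval search, ranking loss, and nuclear-norm bit
# allocation, assuming uniform symmetric quantization. Not the paper's code.
import torch

def quantize(x, delta, bits=8):
    """Uniform symmetric quantization of x with interval (step size) delta."""
    qmax = 2 ** (bits - 1) - 1
    return torch.clamp(torch.round(x / delta), -qmax, qmax) * delta

def search_interval(x, bits=8, num_candidates=20):
    """Grid-search the interval that minimizes the reconstruction error."""
    best_delta, best_err = None, float("inf")
    max_val = x.abs().max()
    for frac in torch.linspace(0.5, 1.0, num_candidates):
        delta = frac * max_val / (2 ** (bits - 1) - 1)
        err = (quantize(x, delta, bits) - x).pow(2).mean()
        if err < best_err:
            best_delta, best_err = delta, err
    return best_delta

def ranking_loss(attn_q, attn_fp, margin=0.0):
    """Pairwise hinge loss penalizing changes in the relative order of
    self-attention scores after quantization (hypothetical formulation)."""
    diff_fp = attn_fp.unsqueeze(-1) - attn_fp.unsqueeze(-2)  # ground-truth order
    diff_q = attn_q.unsqueeze(-1) - attn_q.unsqueeze(-2)     # quantized order
    return torch.relu(margin - torch.sign(diff_fp) * diff_q).mean()

def nuclear_norm_bits(feature, low=4, high=8, threshold=None):
    """Assign higher precision to layers whose attention map / output feature
    has a larger nuclear norm (sum of singular values), i.e. richer diversity."""
    nn_val = torch.linalg.matrix_norm(feature, ord="nuc")
    if threshold is None:
        threshold = feature.numel() ** 0.5  # illustrative cutoff, not from the paper
    return high if nn_val > threshold else low

# Toy usage: one self-attention score tensor from a calibration batch.
attn_fp = torch.randn(2, 4, 16, 16)          # (batch, heads, queries, keys)
delta = search_interval(attn_fp, bits=8)
attn_q = quantize(attn_fp, delta, bits=8)
print("ranking loss:", ranking_loss(attn_q, attn_fp).item())
print("bits for this layer:", nuclear_norm_bits(attn_fp[0, 0]))
```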
Pages: 12
Related Papers
50 records in total
  • [1] Towards Accurate Post-Training Quantization for Vision Transformer
    Ding, Yifu
    Qin, Haotong
    Yan, Qinghua
    Chai, Zhenhua
    Liu, Junjie
    Wei, Xiaolin
    Liu, Xianglong
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 5380 - 5388
  • [2] Post-Training Quantization for Vision Transformer in Transformed Domain
    Feng, Kai
    Chen, Zhuo
    Gao, Fei
    Wang, Zhe
    Xu, Long
    Lin, Weisi
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 1457 - 1462
  • [3] Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision Transformer
    Shi, Huihong
    Shao, Haikuo
    Mao, Wendong
    Wang, Zhongfeng
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2025, 72 (03) : 1296 - 1307
  • [4] AdaLog: Post-training Quantization for Vision Transformers with Adaptive Logarithm Quantizer
    Wu, Zhuguanyu
    Chen, Jiaxin
    Zhong, Hanwen
    Huang, Di
    Wang, Yunhong
    COMPUTER VISION - ECCV 2024, PT XXVII, 2025, 15085 : 411 - 427
  • [5] PTQ4ViT: Post-training Quantization for Vision Transformers with Twin Uniform Quantization
    Yuan, Zhihang
    Xue, Chenhao
    Chen, Yiqi
    Wu, Qiang
    Sun, Guangyu
    COMPUTER VISION, ECCV 2022, PT XII, 2022, 13672 : 191 - 207
  • [6] Loss aware post-training quantization
    Nahshan, Yury
    Chmiel, Brian
    Baskin, Chaim
    Zheltonozhskii, Evgenii
    Banner, Ron
    Bronstein, Alex M.
    Mendelson, Avi
    MACHINE LEARNING, 2021, 110 : 3245 - 3262
  • [7] P2-ViT: Power-of-Two Post-Training Quantization and Acceleration for Fully Quantized Vision Transformer
    Shi, Huihong
    Cheng, Xin
    Mao, Wendong
    Wang, Zhongfeng
    IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2024, 32 (09) : 1704 - 1717
  • [8] Post-training Quantization on Diffusion Models
    Shang, Yuzhang
    Yuan, Zhihang
    Xie, Bin
    Wu, Bingzhe
    Yan, Yan
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 1972 - 1981
  • [9] Attention Round for post-training quantization
    Diao, Huabin
    Li, Gongyan
    Xu, Shaoyun
    Kong, Chao
    Wang, Wei
    NEUROCOMPUTING, 2024, 565
  • [10] Loss aware post-training quantization
    Nahshan, Yury
    Chmiel, Brian
    Baskin, Chaim
    Zheltonozhskii, Evgenii
    Banner, Ron
    Bronstein, Alex M.
    Mendelson, Avi
    MACHINE LEARNING, 2021, 110 (11-12) : 3245 - 3262