Post-training quantization for re-parameterization via coarse & fine weight splitting

Cited by: 1
Authors
Yang, Dawei [1 ]
He, Ning [2 ,3 ]
Hu, Xing [2 ]
Yuan, Zhihang [2 ]
Yu, Jiangyong [2 ]
Xu, Chen [2 ]
Jiang, Zhe [3 ,4 ]
Affiliations
[1] Nanjing Inst Technol, Sch Comp Engn, Nanjing, Peoples R China
[2] Houmo AI, Nanjing, Peoples R China
[3] Southeast Univ, Nanjing, Peoples R China
[4] Univ Cambridge, Cambridge, England
Funding
Swedish Research Council
Keywords
PTQ; CNN; Quantization
DOI
10.1016/j.sysarc.2024.103065
CLC number
TP3 [Computing Technology, Computer Technology]
Subject classification code
0812
Abstract
Although neural networks have made remarkable advancements in various applications, they require substantial computational and memory resources. Network quantization is a powerful technique for compressing neural networks, allowing for more efficient and scalable AI deployment. Recently, re-parameterization has emerged as a promising technique to enhance model performance while simultaneously alleviating the computational burden in various computer vision tasks. However, accuracy drops significantly when quantization is applied to re-parameterized networks. We identify that the primary challenge arises from the large variation in weight distribution across the original branches. To address this issue, we propose a coarse & fine weight splitting (CFWS) method to reduce the quantization error of weights, and develop an improved KL metric to determine optimal quantization scales for activations. To the best of our knowledge, our approach is the first to make post-training quantization applicable to re-parameterized networks. For example, the quantized RepVGG-A1 model exhibits a mere 0.3% accuracy loss. The code is available at https://github.com/NeonHo/Coarse-Fine-Weight-Split.git
Pages: 9
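
The record gives only the high-level idea, so below is a minimal NumPy sketch of coarse & fine weight splitting as the abstract describes it: the fused weights are split into a coarse part covering the dense central mass and a fine residual part absorbing the branch-induced outliers, each quantized with its own scale. This is an illustrative reconstruction, not the authors' implementation (see the linked repository for that); the quantile-based threshold, the function name cfws_quantize, and all parameter values are assumptions, and the improved KL metric for activations is not sketched here.

import numpy as np

def quantize_dequantize(w, scale, n_bits=8):
    """Symmetric uniform quantize-dequantize (simulated quantization)."""
    qmax = 2 ** (n_bits - 1) - 1
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

def cfws_quantize(w, coarse_quantile=0.99, n_bits=8, eps=1e-8):
    """Illustrative coarse & fine weight splitting (not the paper's exact algorithm).

    Fused weights of a re-parameterized layer mix a dense central mass
    with branch-induced outliers. Splitting lets the coarse part use a
    small scale matched to the central mass, while the fine residual
    absorbs the outliers with its own scale, so neither range is
    stretched by the other.
    """
    qmax = 2 ** (n_bits - 1) - 1

    # Coarse clipping threshold from a high quantile of |w|,
    # deliberately ignoring the outlier tail.
    t = np.quantile(np.abs(w), coarse_quantile)
    w_coarse = np.clip(w, -t, t)
    w_fine = w - w_coarse  # nonzero only where |w| > t

    wq_coarse = quantize_dequantize(w_coarse, max(t, eps) / qmax, n_bits)
    fine_range = max(np.max(np.abs(w_fine)), eps)
    wq_fine = quantize_dequantize(w_fine, fine_range / qmax, n_bits)
    return wq_coarse + wq_fine

# Toy comparison on heavy-tailed weights, mimicking a fused layer:
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=10_000)
w[rng.choice(w.size, 50, replace=False)] += rng.normal(0.0, 0.5, size=50)

naive = quantize_dequantize(w, np.max(np.abs(w)) / 127)
split = cfws_quantize(w)
print("naive MSE:", np.mean((w - naive) ** 2))
print("CFWS  MSE:", np.mean((w - split) ** 2))

On heavy-tailed weights like this toy example, the split variant typically reduces mean-squared quantization error by more than an order of magnitude relative to a single min-max scale, which mirrors the failure mode the abstract attributes to the fused branches of re-parameterized networks.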