Post-training quantization for re-parameterization via coarse & fine weight splitting

Cited by: 1
Authors
Yang, Dawei [1 ]
He, Ning [2 ,3 ]
Hu, Xing [2 ]
Yuan, Zhihang [2 ]
Yu, Jiangyong [2 ]
Xu, Chen [2 ]
Jiang, Zhe [3 ,4 ]
Affiliations
[1] Nanjing Inst Technol, Sch Comp Engn, Nanjing, Peoples R China
[2] Houmo AI, Nanjing, Peoples R China
[3] Southeast Univ, Nanjing, Peoples R China
[4] Univ Cambridge, Cambridge, England
Funding
Swedish Research Council
Keywords
PTQ; CNN; Quantization
DOI
10.1016/j.sysarc.2024.103065
CLC number
TP3 [Computing Technology, Computer Technology]
Subject classification code
0812
Abstract
Although neural networks have made remarkable advancements in various applications, they require substantial computational and memory resources. Network quantization is a powerful technique for compressing neural networks, allowing for more efficient and scalable AI deployment. Recently, re-parameterization has emerged as a promising technique to enhance model performance while simultaneously alleviating the computational burden in various computer vision tasks. However, accuracy drops significantly when quantization is applied to re-parameterized networks. We identify that the primary challenge arises from the large variation in weight distribution across the original branches. To address this issue, we propose a coarse & fine weight splitting (CFWS) method to reduce the quantization error of weights, and develop an improved KL metric to determine optimal quantization scales for activations. To the best of our knowledge, our approach is the first to make post-training quantization applicable to re-parameterized networks. For example, the quantized RepVGG-A1 model exhibits a mere 0.3% accuracy loss. The code is available at https://github.com/NeonHo/Coarse-Fine-Weight-Split.git
Pages: 9
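
The record gives only the high-level idea, so below is a minimal NumPy sketch of coarse & fine weight splitting as the abstract describes it: the fused weights are split into a coarse part covering the dense central mass and a fine residual part absorbing the branch-induced outliers, each quantized with its own scale. This is an illustrative reconstruction, not the authors' implementation (see the linked repository for that); the quantile-based threshold, the function name cfws_quantize, and all parameter values are assumptions, and the improved KL metric for activations is not sketched here.

import numpy as np

def quantize_dequantize(w, scale, n_bits=8):
    """Symmetric uniform quantize-dequantize (simulated quantization)."""
    qmax = 2 ** (n_bits - 1) - 1
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

def cfws_quantize(w, coarse_quantile=0.99, n_bits=8, eps=1e-8):
    """Illustrative coarse & fine weight splitting (not the paper's exact algorithm).

    Fused weights of a re-parameterized layer mix a dense central mass
    with branch-induced outliers. Splitting lets the coarse part use a
    small scale matched to the central mass, while the fine residual
    absorbs the outliers with its own scale, so neither range is
    stretched by the other.
    """
    qmax = 2 ** (n_bits - 1) - 1

    # Coarse clipping threshold from a high quantile of |w|,
    # deliberately ignoring the outlier tail.
    t = np.quantile(np.abs(w), coarse_quantile)
    w_coarse = np.clip(w, -t, t)
    w_fine = w - w_coarse  # nonzero only where |w| > t

    wq_coarse = quantize_dequantize(w_coarse, max(t, eps) / qmax, n_bits)
    fine_range = max(np.max(np.abs(w_fine)), eps)
    wq_fine = quantize_dequantize(w_fine, fine_range / qmax, n_bits)
    return wq_coarse + wq_fine

# Toy comparison on heavy-tailed weights, mimicking a fused layer:
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=10_000)
w[rng.choice(w.size, 50, replace=False)] += rng.normal(0.0, 0.5, size=50)

naive = quantize_dequantize(w, np.max(np.abs(w)) / 127)
split = cfws_quantize(w)
print("naive MSE:", np.mean((w - naive) ** 2))
print("CFWS  MSE:", np.mean((w - split) ** 2))

On heavy-tailed weights like this toy example, the split variant typically reduces mean-squared quantization error by more than an order of magnitude relative to a single min-max scale, which mirrors the failure mode the abstract attributes to the fused branches of re-parameterized networks.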