Attention Round for post-training quantization

Cited by: 7
Authors
Diao, Huabin [1 ]
Li, Gongyan [2 ]
Xu, Shaoyun [2 ]
Kong, Chao [1 ]
Wang, Wei [1 ]
Affiliations
[1] Anhui Polytech Univ, Beijing Middle Rd, Wuhu 241000, Anhui, Peoples R China
[2] Chinese Acad Sci, Inst Microelect, 3 Beituocheng West Rd, Beijing 100029, Peoples R China
Funding
National Key R&D Program of China; National Natural Science Foundation of China;
Keywords
Convolutional neural networks; Post-training quantization; Attention Round; Mixed precision;
DOI
10.1016/j.neucom.2023.127012
CLC Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Quantization methods for convolutional neural network models can be broadly categorized into post-training quantization (PTQ) and quantization-aware training (QAT). While PTQ offers the advantage of requiring only a small portion of the data for quantization, the resulting quantized model may not be as effective as one produced by QAT. To address this limitation, this paper proposes a novel quantization function named Attention Round. Unlike traditional quantization functions that map a 32-bit floating-point value w only to nearby quantization levels, Attention Round allows w to be mapped to any level in the entire quantization space, expanding the quantization optimization space. The probability of mapping w to a given quantization level is inversely correlated with the distance between w and that level, regulated by a Gaussian decay function. Furthermore, to tackle the challenge of mixed-precision quantization, this paper introduces a lossy coding length measure that assigns quantization precision to the different layers of the model, eliminating the need to solve a combinatorial optimization problem. Experimental evaluations on various models demonstrate the effectiveness of the proposed method. Notably, for ResNet18 and MobileNetV2, the PTQ approach achieves quantization performance comparable to QAT while using only 1024 training samples and 10 minutes for the quantization process.
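Below is a minimal, illustrative sketch of the stochastic rounding idea the abstract describes: rather than deterministically rounding a weight to its nearest quantization level, a level is sampled from the whole grid with probabilities that decay with distance under a Gaussian kernel. The exact probability form, the sigma value, the 4-bit uniform grid, and the function name attention_round are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def attention_round(w, levels, sigma=0.1, rng=None):
    """Sketch of the Attention Round idea (assumed form, not the paper's exact one):
    sample a quantization level for weight w from the full grid, with probabilities
    that decay with the distance |w - q| under a Gaussian kernel."""
    rng = np.random.default_rng() if rng is None else rng
    levels = np.asarray(levels, dtype=np.float64)
    # Gaussian decay: closer levels get exponentially higher probability.
    logits = -((w - levels) ** 2) / (2.0 * sigma ** 2)
    logits -= logits.max()          # subtract max for numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return rng.choice(levels, p=probs)

# Example: a hypothetical 4-bit uniform grid on [-1, 1]. The weight 0.37 is
# usually mapped to a nearby level, but occasionally to a farther one.
grid = np.linspace(-1.0, 1.0, 16)
samples = [attention_round(0.37, grid, sigma=0.1) for _ in range(5)]
print(samples)
```

Sampling over the full grid lets calibration occasionally reach levels a nearest-rounding scheme would never consider, which corresponds to the enlarged quantization optimization space the abstract refers to.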
Pages: 10
Related Papers (showing 10 of 50)
  • [1] Nahshan, Yury; Chmiel, Brian; Baskin, Chaim; Zheltonozhskii, Evgenii; Banner, Ron; Bronstein, Alex M.; Mendelson, Avi. Loss aware post-training quantization. Machine Learning, 2021, 110: 3245-3262.
  • [2] Liu, Zhenhua; Wang, Yunhe; Han, Kai; Zhang, Wei; Ma, Siwei; Gao, Wen. Post-Training Quantization for Vision Transformer. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 2021, 34.
  • [3] Shang, Yuzhang; Yuan, Zhihang; Xie, Bin; Wu, Bingzhe; Yan, Yan. Post-training Quantization on Diffusion Models. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023: 1972-1981.
  • [4] Nahshan, Yury; Chmiel, Brian; Baskin, Chaim; Zheltonozhskii, Evgenii; Banner, Ron; Bronstein, Alex M.; Mendelson, Avi. Loss aware post-training quantization. Machine Learning, 2021, 110(11-12): 3245-3262.
  • [5] Shomron, Gil; Gabbay, Freddy; Kurzum, Samer; Weiser, Uri. Post-Training Sparsity-Aware Quantization. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 2021, 34.
  • [6] Zhang, Luoming; He, Yefei; Fei, Wen; Lou, Zhenyu; Wu, Weijia; Ying, Yangwei; Zhou, Hong. Towards accurate post-training quantization for reparameterized models. Applied Intelligence, 2025, 55(07).
  • [7] Ding, Yifu; Qin, Haotong; Yan, Qinghua; Chai, Zhenhua; Liu, Junjie; Wei, Xiaolin; Liu, Xianglong. Towards Accurate Post-Training Quantization for Vision Transformer. Proceedings of the 30th ACM International Conference on Multimedia (MM 2022), 2022: 5380-5388.
  • [8] Chu, Tianshu; Yang, Zuopeng; Huang, Xiaolin. Improving the Post-Training Neural Network Quantization by Prepositive Feature Quantization. IEEE Transactions on Circuits and Systems for Video Technology, 2024, 34(04): 3056-3060.
  • [9] Khayrov, E. M.; Malsagov, M. Yu.; Karandashev, I. M. Post-training Quantization of Deep Neural Network Weights. Advances in Neural Computation, Machine Learning, and Cognitive Research III, 2020, 856: 230-238.
  • [10] Feng, Kai; Chen, Zhuo; Gao, Fei; Wang, Zhe; Xu, Long; Lin, Weisi. Post-Training Quantization for Vision Transformer in Transformed Domain. 2023 IEEE International Conference on Multimedia and Expo (ICME), 2023: 1457-1462.