Deep Neural Network Quantization via Layer-Wise Optimization Using Limited Training Data

Cited by: 0
Authors
Chen, Shangyu [1 ]
Wang, Wenya [1 ]
Pan, Sinno Jialin [1 ]
Affiliation
[1] Nanyang Technol Univ, Singapore, Singapore
Keywords: (none listed)
DOI: Not available
CLC Number: TP18 [Artificial Intelligence Theory]
Subject Classification Codes: 081104; 0812; 0835; 1405
Abstract
The advancement of deep models poses great challenges to real-world deployment because of the limited computational ability and storage space on edge devices. To address this problem, existing works prune or quantize deep models. However, most existing methods rely heavily on a supervised training process to achieve satisfactory performance, requiring a large amount of labeled training data, which may not be practical for real deployment. In this paper, we propose a novel layer-wise quantization method for deep neural networks that requires only limited training data (1% of the original dataset). Specifically, we formulate parameter quantization for each layer as a discrete optimization problem and solve it using the Alternating Direction Method of Multipliers (ADMM), which yields an efficient closed-form solution. We prove that the final performance drop after quantization is bounded by a linear combination of the reconstruction errors introduced at each layer. Based on this theorem, we propose an algorithm that quantizes a deep neural network layer by layer, with an additional weight-update step to minimize the final error. Extensive experiments on benchmark deep models demonstrate the effectiveness of the proposed method using 1% of the CIFAR-10 and ImageNet datasets. Code is available at: https://github.com/csyhhu/L-DNQ
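To give a feel for the layer-wise ADMM formulation described in the abstract, the sketch below quantizes a single fully-connected layer by alternating a closed-form least-squares update, a projection onto a uniform fixed-point grid, and a dual update, using only a small calibration batch. This is an illustrative NumPy sketch under our own assumptions (the grid, and names such as quantize_to_grid, rho, and n_bits, are ours); it is not the authors' released L-DNQ implementation, which is available at the GitHub link above.

    import numpy as np

    def quantize_to_grid(w, n_bits=4):
        # Project onto a symmetric uniform grid with 2^(n_bits-1) - 1 positive levels
        # (an assumed grid; the paper's quantization levels may differ).
        q_max = 2 ** (n_bits - 1) - 1
        scale = np.abs(w).max() / q_max + 1e-12
        return np.clip(np.round(w / scale), -q_max, q_max) * scale

    def admm_layer_quantization(X, W_full, n_bits=4, rho=1e-2, n_iters=50):
        # Quantize one layer so that X @ W_q reconstructs the full-precision
        # outputs X @ W_full, using only the small calibration batch X.
        Y = X @ W_full                      # target (full-precision) layer outputs
        W = W_full.copy()                   # continuous auxiliary weights
        Q = quantize_to_grid(W, n_bits)     # quantized weights
        U = np.zeros_like(W)                # scaled dual variable

        # Closed-form W-update solves (X^T X + rho*I) W = X^T Y + rho (Q - U).
        A = X.T @ X + rho * np.eye(X.shape[1])
        for _ in range(n_iters):
            W = np.linalg.solve(A, X.T @ Y + rho * (Q - U))  # least-squares step
            Q = quantize_to_grid(W + U, n_bits)              # projection step
            U = U + W - Q                                    # dual update
        return Q

    # Toy usage with random data standing in for a 1% calibration subset:
    rng = np.random.default_rng(0)
    X = rng.standard_normal((128, 64))
    W = rng.standard_normal((64, 32))
    W_q = admm_layer_quantization(X, W)
    rel_err = np.linalg.norm(X @ W - X @ W_q) / np.linalg.norm(X @ W)
    print(f"relative reconstruction error: {rel_err:.4f}")

The printed per-layer reconstruction error roughly corresponds to the quantity whose weighted (layer-wise) sum bounds the final performance drop in the theorem stated in the abstract.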
Pages: 3329-3336
Page count: 8
Related Papers (50 in total)
  • [1] LAYER-WISE DEEP NEURAL NETWORK PRUNING VIA ITERATIVELY REWEIGHTED OPTIMIZATION
    Jiang, Tao
    Yang, Xiangyu
    Shi, Yuanming
    Wang, Hao
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 5606 - 5610
  • [2] Post-training deep neural network pruning via layer-wise calibration
    Lazarevich, Ivan
    Kozlov, Alexander
    Malinin, Nikita
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, : 798 - 805
  • [3] Mixed-Precision Neural Network Quantization via Learned Layer-Wise Importance
    Tang, Chen
    Ouyang, Kai
    Wang, Zhi
    Zhu, Yifei
    Ji, Wen
    Wang, Yaowei
    Zhu, Wenwu
    [J]. COMPUTER VISION, ECCV 2022, PT XI, 2022, 13671 : 259 - 275
  • [4] Voice Conversion Using Deep Neural Networks With Layer-Wise Generative Training
    Chen, Ling-Hui
    Ling, Zhen-Hua
    Liu, Li-Juan
    Dai, Li-Rong
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (12) : 1859 - 1872
  • [5] Improving deep neural network generalization and robustness to background bias via layer-wise relevance propagation optimization
    Bassi, Pedro R. A. S.
    Dertkigil, Sergio S. J.
    Cavalli, Andrea
    [J]. NATURE COMMUNICATIONS, 2024, 15 (01)
  • [6] Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training
    Zhou, Yefan
    Pang, Tianyu
    Liu, Keqin
    Martin, Charles H.
    Mahoney, Michael W.
    Yang, Yaoqing
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [7] Explaining Deep Neural Network using Layer-wise Relevance Propagation and Integrated Gradients
    Cik, Ivan
    Rasamoelina, Andrindrasana David
    Mach, Marian
    Sincak, Peter
    [J]. 2021 IEEE 19TH WORLD SYMPOSIUM ON APPLIED MACHINE INTELLIGENCE AND INFORMATICS (SAMI 2021), 2021, : 381 - 386
  • [8] SPSA for Layer-Wise Training of Deep Networks
    Wulff, Benjamin
    Schuecker, Jannis
    Bauckhage, Christian
    [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2018, PT III, 2018, 11141 : 564 - 573
  • [9] REINFORCEMENT LEARNING-BASED LAYER-WISE QUANTIZATION FOR LIGHTWEIGHT DEEP NEURAL NETWORKS
    Jung, Juri
    Kim, Jonghee
    Kim, Youngeun
    Kim, Changick
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020, : 3070 - 3074