Model-based Weight Quantization for Convolutional Neural Network Compression

Citations: 0
Authors
Gheorghe, Stefan [1 ]
Ivanovici, Mihai [1 ]
Affiliations
[1] Transilvania Univ Brasov, Elect & Comp Dept, Brasov, Romania
Source
2021 16TH INTERNATIONAL CONFERENCE ON ENGINEERING OF MODERN ELECTRIC SYSTEMS (EMES) | 2021
Keywords
weight quantization; convolutional neural network; CNN compression;
DOI
10.1109/EMES52337.2021.9484143
CLC Classification Number
TE [Petroleum and Natural Gas Industry]; TK [Energy and Power Engineering]
Subject Classification Code
0807; 0820
Abstract
Convolutional neural networks nowadays have many applications, from autonomous driving to the medical field. The need for fast deployment of algorithms and reduced latency at a reduced cost has created a new paradigm, edge computing, which brings the computational power closer to the user. One limitation for the implementation of convolutional neural networks on edge devices is their large number of parameters, which implies high memory usage. As the memory constraints on an edge device are very strict, the memory footprint of convolutional neural networks must be reduced; one way to do so is to quantize the network weights. We propose an adaptive model-based quantization method with a parameterizable number of quantization intervals. Using the double exponential (Laplace) probability density function, the quantization intervals are determined by taking into account the histogram of the trained network weights. For a hardware implementation, we modified a classical convolutional network architecture by replacing the fully connected layers with convolutional layers and changing the activation functions. We compared the performance of several convolutional network architectures with the proposed quantization method applied to their trained weights. We present experimental results in terms of both accuracy and network size for the unquantized network and for the network quantized with the proposed method and with a non-adaptive one. The proposed method reduces the model size to a quarter, in some cases without any loss of accuracy.
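As a reading aid, the snippet below is a minimal Python sketch of the kind of model-based quantizer the abstract describes: a Laplace (double exponential) density is fitted to the trained weights, and the quantization intervals are placed as equiprobable quantiles of that model. The fitting procedure, interval placement, codebook values and all names below are illustrative assumptions, not the authors' exact algorithm.

# Minimal sketch, assuming the "double exponential" model is a Laplace
# distribution fitted to the trained weights and that the quantization
# intervals are equiprobable under that model. The paper's exact procedure
# may differ; everything here is illustrative.
import numpy as np

def laplace_ppf(p, mu, b):
    """Inverse CDF (quantile function) of the Laplace(mu, b) distribution."""
    p = np.asarray(p, dtype=np.float64)
    return np.where(p < 0.5,
                    mu + b * np.log(2.0 * p),
                    mu - b * np.log(2.0 * (1.0 - p)))

def model_based_quantize(weights, num_intervals=256):
    """Quantize a weight tensor to `num_intervals` levels.

    Returns integer indices (the compressed representation) and a codebook of
    representative values, so the tensor can be approximated as codebook[indices].
    """
    w = weights.ravel()

    # Maximum-likelihood Laplace fit: location = median, scale = mean |deviation|.
    mu = np.median(w)
    b = np.mean(np.abs(w - mu)) + 1e-12

    # Interval boundaries at equiprobable quantiles of the fitted model, so the
    # densely populated region around zero gets finer quantization steps.
    probs = np.arange(1, num_intervals) / num_intervals
    edges = laplace_ppf(probs, mu, b)

    # Representative value of each interval: the quantile at its probability midpoint.
    mid_probs = (np.arange(num_intervals) + 0.5) / num_intervals
    codebook = laplace_ppf(mid_probs, mu, b)

    # Map every weight to the index of the interval it falls into.
    dtype = np.uint8 if num_intervals <= 256 else np.uint16
    indices = np.digitize(w, edges).astype(dtype)
    return indices.reshape(weights.shape), codebook.astype(weights.dtype)

# Example: quantizing a layer's weights to 8-bit indices.
weights = np.random.randn(64, 3, 3, 3).astype(np.float32) * 0.1
idx, codebook = model_based_quantize(weights, num_intervals=256)
dequantized = codebook[idx]
print("max abs error:", np.max(np.abs(dequantized - weights)))

Storing the 8-bit interval indices together with a small floating-point codebook, instead of 32-bit weights, is one straightforward way to obtain the four-fold size reduction reported in the abstract.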
Pages: 94-97
Number of pages: 4
Related Papers
50 items in total
  • [1] Reconfigurable and hardware efficient adaptive quantization model-based accelerator for binarized neural network
    Sasikumar, A.
    Ravi, Logesh
    Kotecha, Ketan
    Indragandhi, V
    Subramaniyaswamy, V
    COMPUTERS & ELECTRICAL ENGINEERING, 2022, 102
  • [2] Physical Limitation Aware Quantization Model for Photonic Convolutional Neural Network
    Jiang, Yue
    Zhang, Wenjia
    Yang, Fan
    He, Zuyuan
    2021 ASIA COMMUNICATIONS AND PHOTONICS CONFERENCE (ACP), 2021
  • [3] Deep Neural Network Compression Method Based on Product Quantization
    Fang, Xiuqin
    Liu, Han
    Xie, Guo
    Zhang, Youmin
    Liu, Ding
    PROCEEDINGS OF THE 39TH CHINESE CONTROL CONFERENCE, 2020: 7035-7040
  • [4] Max-Variance Convolutional Neural Network Model Compression
    Boone-Sifuentes, Tanya
    Robles-Kelly, Antonio
    Nazari, Asef
    2020 DIGITAL IMAGE COMPUTING: TECHNIQUES AND APPLICATIONS (DICTA), 2020
  • [5] Model-based convolutional neural network approach to underwater source-range estimation
    Chen, R.
    Schmidt, H.
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2021, 149 (1): 405-420
  • [6] Model-based Fault Detection and Diagnosis for HVAC Systems Using Convolutional Neural Network
    Miyata, Shohei
    Akashi, Yasunori
    Lim, Jongyeon
    Kuwahara, Yasuhiro
    Tanaka, Katsuhiko
    PROCEEDINGS OF BUILDING SIMULATION 2019: 16TH CONFERENCE OF IBPSA, 2020: 853-860
  • [7] Convolutional Neural Network Accelerator with Vector Quantization
    Lee, Heng
    Wu, Yi-Heng
    Lin, Yu-Sheng
    Chien, Shao-Yi
    2019 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2019
  • [8] QNet: An Adaptive Quantization Table Generator Based on Convolutional Neural Network
    Yan, Xiao
    Fan, Yibo
    Chen, Kewei
    Yu, Xulin
    Zeng, Xiaoyang
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29: 9654-9664
  • [9] Neural Network Language Model Compression With Product Quantization and Soft Binarization
    Yu, Kai
    Ma, Rao
    Shi, Kaiyu
    Liu, Qi
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28: 2438-2449