Model-based Weight Quantization for Convolutional Neural Network Compression

Citations: 0
Authors
Gheorghe, Stefan [1 ]
Ivanovici, Mihai [1 ]
Affiliations
[1] Transilvania Univ Brasov, Elect & Comp Dept, Brasov, Romania
Source
2021 16TH INTERNATIONAL CONFERENCE ON ENGINEERING OF MODERN ELECTRIC SYSTEMS (EMES) | 2021
Keywords
weight quantization; convolutional neural network; CNN compression;
DOI
10.1109/EMES52337.2021.9484143
CLC Classification Number
TE [Petroleum and Natural Gas Industry]; TK [Energy and Power Engineering]
Subject Classification Code
0807; 0820
Abstract
Convolutional neural networks nowadays have many applications, from autonomous driving to the medical field. The need for fast deployment of algorithms and reduced latency at a reduced cost has created a new paradigm, edge computing, which brings the computational power closer to the user. One limitation for the implementation of convolutional neural networks on edge devices is their large number of parameters, which implies high memory usage. As the memory constraints on an edge device are very strict, the memory footprint of convolutional neural networks must be reduced; one way to do so is to quantize the network weights. We propose an adaptive model-based quantization method with a parameterizable number of quantization intervals. Using the double exponential (Laplace) probability density function, the quantization intervals are determined by taking into account the histogram of the trained network weights. For a hardware implementation, we modified a classical convolutional network architecture by replacing the fully connected layers with convolutional layers and changing the activation functions. We compared the performance of several convolutional network architectures with the proposed quantization method applied to their trained weights. We present experimental results in terms of both accuracy and network size for the unquantized network and for the network quantized with the proposed method and with a non-adaptive one. The proposed method reduces the model size to a quarter, in some cases without any loss of accuracy.
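As a reading aid, the snippet below is a minimal Python sketch of the kind of model-based quantizer the abstract describes: a Laplace (double exponential) density is fitted to the trained weights, and the quantization intervals are placed as equiprobable quantiles of that model. The fitting procedure, interval placement, codebook values and all names below are illustrative assumptions, not the authors' exact algorithm.

# Minimal sketch, assuming the "double exponential" model is a Laplace
# distribution fitted to the trained weights and that the quantization
# intervals are equiprobable under that model. The paper's exact procedure
# may differ; everything here is illustrative.
import numpy as np

def laplace_ppf(p, mu, b):
    """Inverse CDF (quantile function) of the Laplace(mu, b) distribution."""
    p = np.asarray(p, dtype=np.float64)
    return np.where(p < 0.5,
                    mu + b * np.log(2.0 * p),
                    mu - b * np.log(2.0 * (1.0 - p)))

def model_based_quantize(weights, num_intervals=256):
    """Quantize a weight tensor to `num_intervals` levels.

    Returns integer indices (the compressed representation) and a codebook of
    representative values, so the tensor can be approximated as codebook[indices].
    """
    w = weights.ravel()

    # Maximum-likelihood Laplace fit: location = median, scale = mean |deviation|.
    mu = np.median(w)
    b = np.mean(np.abs(w - mu)) + 1e-12

    # Interval boundaries at equiprobable quantiles of the fitted model, so the
    # densely populated region around zero gets finer quantization steps.
    probs = np.arange(1, num_intervals) / num_intervals
    edges = laplace_ppf(probs, mu, b)

    # Representative value of each interval: the quantile at its probability midpoint.
    mid_probs = (np.arange(num_intervals) + 0.5) / num_intervals
    codebook = laplace_ppf(mid_probs, mu, b)

    # Map every weight to the index of the interval it falls into.
    dtype = np.uint8 if num_intervals <= 256 else np.uint16
    indices = np.digitize(w, edges).astype(dtype)
    return indices.reshape(weights.shape), codebook.astype(weights.dtype)

# Example: quantizing a layer's weights to 8-bit indices.
weights = np.random.randn(64, 3, 3, 3).astype(np.float32) * 0.1
idx, codebook = model_based_quantize(weights, num_intervals=256)
dequantized = codebook[idx]
print("max abs error:", np.max(np.abs(dequantized - weights)))

Storing the 8-bit interval indices together with a small floating-point codebook, instead of 32-bit weights, is one straightforward way to obtain the four-fold size reduction reported in the abstract.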
Pages: 94-97
Number of pages: 4
Related Papers
50 items in total
  • [1] Reconfigurable and hardware efficient adaptive quantization model-based accelerator for binarized neural network
    Sasikumar, A.
    Ravi, Logesh
    Kotecha, Ketan
    Indragandhi, V
    Subramaniyaswamy, V
    COMPUTERS & ELECTRICAL ENGINEERING, 2022, 102
  • [2] Physical Limitation Aware Quantization Model for Photonic Convolutional Neural Network
    Jiang, Yue
    Zhang, Wenjia
    Yang, Fan
    He, Zuyuan
    2021 ASIA COMMUNICATIONS AND PHOTONICS CONFERENCE (ACP), 2021
  • [3] Deep Neural Network Compression Method Based on Product Quantization
    Fang, Xiuqin
    Liu, Han
    Xie, Guo
    Zhang, Youmin
    Liu, Ding
    PROCEEDINGS OF THE 39TH CHINESE CONTROL CONFERENCE, 2020: 7035-7040
  • [4] Max-Variance Convolutional Neural Network Model Compression
    Boone-Sifuentes, Tanya
    Robles-Kelly, Antonio
    Nazari, Asef
    2020 DIGITAL IMAGE COMPUTING: TECHNIQUES AND APPLICATIONS (DICTA), 2020
  • [5] Model-based convolutional neural network approach to underwater source-range estimation
    Chen, R.
    Schmidt, H.
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2021, 149 (1): 405-420
  • [6] Model-based Fault Detection and Diagnosis for HVAC Systems Using Convolutional Neural Network
    Miyata, Shohei
    Akashi, Yasunori
    Lim, Jongyeon
    Kuwahara, Yasuhiro
    Tanaka, Katsuhiko
    PROCEEDINGS OF BUILDING SIMULATION 2019: 16TH CONFERENCE OF IBPSA, 2020: 853-860
  • [7] Convolutional Neural Network Accelerator with Vector Quantization
    Lee, Heng
    Wu, Yi-Heng
    Lin, Yu-Sheng
    Chien, Shao-Yi
    2019 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2019
  • [8] QNet: An Adaptive Quantization Table Generator Based on Convolutional Neural Network
    Yan, Xiao
    Fan, Yibo
    Chen, Kewei
    Yu, Xulin
    Zeng, Xiaoyang
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29: 9654-9664
  • [9] Neural Network Language Model Compression With Product Quantization and Soft Binarization
    Yu, Kai
    Ma, Rao
    Shi, Kaiyu
    Liu, Qi
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28: 2438-2449