Space Efficient Quantization for Deep Convolutional Neural Networks

Cited by: 0
Authors
Dong-Di Zhao
Fan Li
Kashif Sharif
Guang-Min Xia
Yu Wang
Affiliations
[1] Beijing Institute of Technology, School of Computer Science
[2] University of North Carolina at Charlotte, Wireless Networking and Sensing Laboratory, Department of Computer Science
Keywords
convolutional neural network; memory compression; network quantization
DOI: not available
Abstract
Deep convolutional neural networks (DCNNs) have shown outstanding performance in the fields of computer vision, natural language processing, and complex system analysis. As their performance improves with deeper layers, DCNNs incur higher computational complexity and larger storage requirements, making it extremely difficult to deploy them on resource-limited embedded systems (such as mobile devices or Internet of Things devices). Network quantization efficiently reduces the storage space required by DCNNs, but their accuracy often drops rapidly as the quantization bit-width decreases. In this article, we propose a space-efficient quantization scheme which uses eight or fewer bits to represent the original 32-bit weights. We adopt the singular value decomposition (SVD) method to decrease the parameter size of fully-connected layers for further compression. Additionally, we propose a weight clipping method based on a dynamic boundary to improve performance when lower precision is used. Experimental results demonstrate that our approach achieves up to approximately 14x compression while preserving almost the same accuracy as the full-precision models. The proposed weight clipping method can also significantly improve the performance of DCNNs when lower precision is required.
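The abstract does not spell out the quantization or clipping rules, so the sketch below is only a minimal NumPy illustration of the general idea: weights are first clipped to a boundary derived dynamically from their distribution (here a high percentile of the absolute values, which is an assumption for illustration; the paper's boundary rule may differ), then mapped to signed n-bit integers with a single scale factor.

```python
import numpy as np

def clip_and_quantize(weights, num_bits=8, percentile=99.9):
    # Dynamic clipping boundary: a high percentile of |w| ignores
    # extreme outliers instead of letting them stretch the range.
    # (Illustrative rule, not necessarily the paper's exact one.)
    bound = np.percentile(np.abs(weights), percentile)
    clipped = np.clip(weights, -bound, bound)

    # Symmetric uniform quantization to signed num_bits integers.
    levels = 2 ** (num_bits - 1) - 1   # 127 for 8 bits, 7 for 4 bits
    scale = bound / levels
    q = np.round(clipped / scale).astype(np.int32)
    return q, scale

def dequantize(q, scale):
    # Recover approximate float weights for inference.
    return q.astype(np.float32) * scale

# Quantize a random weight tensor to 8 bits and check the error:
# after clipping, the reconstruction error is bounded by scale / 2.
w = np.random.randn(512, 512).astype(np.float32)
q, scale = clip_and_quantize(w, num_bits=8)
w_hat = dequantize(q, scale)
print("max abs error:", np.abs(w_hat - np.clip(w, -scale * 127, scale * 127)).max())
```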
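The SVD step on fully-connected layers can likewise be sketched as a standard truncated low-rank factorization: a weight matrix W of size out x in is replaced by two thinner matrices holding rank * (out + in) parameters instead of out * in. The rank below is illustrative, not taken from the paper.

```python
import numpy as np

def svd_compress_fc(W, rank):
    # Truncated SVD: W (out x in) ~= U_r @ V_r,
    # with U_r (out x rank) and V_r (rank x in).
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * S[:rank]   # fold singular values into U
    V_r = Vt[:rank, :]
    return U_r, V_r

# Replace one 4096x4096 FC layer with two thinner ones (rank 256).
W = np.random.randn(4096, 4096).astype(np.float32)
U_r, V_r = svd_compress_fc(W, rank=256)

x = np.random.randn(4096).astype(np.float32)
y_approx = U_r @ (V_r @ x)               # forward pass of the factored layer

ratio = W.size / (U_r.size + V_r.size)   # 8x fewer parameters here
print("compression ratio:", ratio)
```

The factored matrices can themselves be quantized with a scheme like the one above, so the two techniques compound, consistent with the "further compression" role the abstract assigns to SVD.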
Published in: Journal of Computer Science and Technology, 2019, 34(2): 305-317
Number of pages: 12