Space Efficient Quantization for Deep Convolutional Neural Networks

Cited by: 0
Authors
Dong-Di Zhao
Fan Li
Kashif Sharif
Guang-Min Xia
Yu Wang
Affiliations
[1] Beijing Institute of Technology, School of Computer Science
[2] University of North Carolina at Charlotte, Wireless Networking and Sensing Laboratory, Department of Computer Science
Keywords
convolutional neural network; memory compression; network quantization
DOI: not available
Abstract
Deep convolutional neural networks (DCNNs) have shown outstanding performance in the fields of computer vision, natural language processing, and complex system analysis. As their performance improves with deeper layers, DCNNs incur higher computational complexity and larger storage requirements, making it extremely difficult to deploy them on resource-limited embedded systems (such as mobile devices or Internet of Things devices). Network quantization efficiently reduces the storage space required by DCNNs, but their accuracy often drops rapidly as the quantization bit-width decreases. In this article, we propose a space-efficient quantization scheme which uses eight or fewer bits to represent the original 32-bit weights. We adopt the singular value decomposition (SVD) method to decrease the parameter size of fully-connected layers for further compression. Additionally, we propose a weight clipping method based on a dynamic boundary to improve performance when lower precision is used. Experimental results demonstrate that our approach achieves up to approximately 14x compression while preserving almost the same accuracy as the full-precision models. The proposed weight clipping method can also significantly improve the performance of DCNNs when lower precision is required.
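The abstract does not spell out the quantization or clipping rules, so the sketch below is only a minimal NumPy illustration of the general idea: weights are first clipped to a boundary derived dynamically from their distribution (here a high percentile of the absolute values, which is an assumption for illustration; the paper's boundary rule may differ), then mapped to signed n-bit integers with a single scale factor.

```python
import numpy as np

def clip_and_quantize(weights, num_bits=8, percentile=99.9):
    # Dynamic clipping boundary: a high percentile of |w| ignores
    # extreme outliers instead of letting them stretch the range.
    # (Illustrative rule, not necessarily the paper's exact one.)
    bound = np.percentile(np.abs(weights), percentile)
    clipped = np.clip(weights, -bound, bound)

    # Symmetric uniform quantization to signed num_bits integers.
    levels = 2 ** (num_bits - 1) - 1   # 127 for 8 bits, 7 for 4 bits
    scale = bound / levels
    q = np.round(clipped / scale).astype(np.int32)
    return q, scale

def dequantize(q, scale):
    # Recover approximate float weights for inference.
    return q.astype(np.float32) * scale

# Quantize a random weight tensor to 8 bits and check the error:
# after clipping, the reconstruction error is bounded by scale / 2.
w = np.random.randn(512, 512).astype(np.float32)
q, scale = clip_and_quantize(w, num_bits=8)
w_hat = dequantize(q, scale)
print("max abs error:", np.abs(w_hat - np.clip(w, -scale * 127, scale * 127)).max())
```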
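The SVD step on fully-connected layers can likewise be sketched as a standard truncated low-rank factorization: a weight matrix W of size out x in is replaced by two thinner matrices holding rank * (out + in) parameters instead of out * in. The rank below is illustrative, not taken from the paper.

```python
import numpy as np

def svd_compress_fc(W, rank):
    # Truncated SVD: W (out x in) ~= U_r @ V_r,
    # with U_r (out x rank) and V_r (rank x in).
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * S[:rank]   # fold singular values into U
    V_r = Vt[:rank, :]
    return U_r, V_r

# Replace one 4096x4096 FC layer with two thinner ones (rank 256).
W = np.random.randn(4096, 4096).astype(np.float32)
U_r, V_r = svd_compress_fc(W, rank=256)

x = np.random.randn(4096).astype(np.float32)
y_approx = U_r @ (V_r @ x)               # forward pass of the factored layer

ratio = W.size / (U_r.size + V_r.size)   # 8x fewer parameters here
print("compression ratio:", ratio)
```

The factored matrices can themselves be quantized with a scheme like the one above, so the two techniques compound, consistent with the "further compression" role the abstract assigns to SVD.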
Published in: Journal of Computer Science and Technology, 2019, 34(2): 305-317
Number of pages: 12