JOINT OPTIMIZATION OF QUANTIZATION AND STRUCTURED SPARSITY FOR COMPRESSED DEEP NEURAL NETWORKS

Cited by: 0
Authors
Srivastava, Gaurav [1]
Kadetotad, Deepak [1]
Yin, Shihui [1]
Berisha, Visar [1]
Chakrabarti, Chaitali [1]
Seo, Jae-sun [1]
Affiliations
[1] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85281 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) Number
O42 [Acoustics]
Subject Classification Codes
070206; 082403
Abstract
The use of Deep Neural Networks (DNNs) on resource-constrained edge devices has been limited by their high computation cost and large memory requirements. In this work, we propose an algorithm to compress DNNs by jointly optimizing structured sparsity and quantization constraints in a single DNN training framework. The proposed algorithm has been extensively validated on high/low-capacity DNNs and wide/deep sparse DNNs. Further, we perform Pareto-optimal analysis to extract optimal DNN models from a large set of trained DNN models. The optimal structurally compressed DNN model achieves ~50X weight memory reduction without test accuracy degradation, compared to the uncompressed floating-point DNN.
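The joint optimization described in the abstract can be illustrated, very roughly, by alternating ordinary gradient updates with a projection of the weights onto block-structured sparse, low-bit values. The PyTorch sketch below is a hypothetical illustration only, not the paper's actual training framework: the block size, bit-width, target sparsity, hard-projection-after-each-step schedule, and the toy model and random data are all assumptions made for demonstration.

# Minimal sketch: joint block-structured pruning + k-bit weight quantization
# inside a training loop. BLOCK, SPARSITY, BITS and the projection schedule
# are illustrative assumptions, not the paper's exact method.
import torch
import torch.nn as nn
import torch.nn.functional as F

BLOCK = 8          # assumed square block size for structured sparsity
SPARSITY = 0.75    # assumed fraction of weight blocks to zero out
BITS = 4           # assumed weight bit-width

def project_weights(layer: nn.Linear) -> None:
    """Zero out low-norm weight blocks, then quantize the survivors to BITS bits."""
    with torch.no_grad():
        w = layer.weight
        rows, cols = w.shape
        # Pad so both dimensions divide evenly into BLOCK x BLOCK tiles.
        pr, pc = (-rows) % BLOCK, (-cols) % BLOCK
        wp = F.pad(w, (0, pc, 0, pr))
        tiles = wp.unfold(0, BLOCK, BLOCK).unfold(1, BLOCK, BLOCK)
        norms = tiles.reshape(tiles.shape[0], tiles.shape[1], -1).norm(dim=-1)
        # Keep the largest-norm blocks; zero the rest (block-structured sparsity).
        k = max(int((1.0 - SPARSITY) * norms.numel()), 1)
        thresh = norms.flatten().topk(k).values.min()
        mask = (norms >= thresh).float()
        mask = mask.repeat_interleave(BLOCK, 0).repeat_interleave(BLOCK, 1)
        wp = wp * mask
        # Uniform symmetric quantization of the remaining weights.
        qmax = 2 ** (BITS - 1) - 1
        scale = wp.abs().max() / qmax + 1e-12
        wp = torch.clamp(torch.round(wp / scale), -qmax, qmax) * scale
        w.copy_(wp[:rows, :cols])

# Toy model and loop on random data, only to show where the projection is applied.
model = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
x, y = torch.randn(64, 256), torch.randint(0, 10, (64,))

for step in range(10):
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()
    for m in model:
        if isinstance(m, nn.Linear):
            project_weights(m)

Sweeping BLOCK, SPARSITY, and BITS over a grid and recording (accuracy, weight-memory) pairs would then allow a Pareto-front selection of models, in the spirit of the Pareto-optimal analysis mentioned in the abstract.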
Pages: 1393-1397
Number of pages: 5
Related Papers (50 in total)
  • [1] Learning Structured Sparsity in Deep Neural Networks. Wen, Wei; Wu, Chunpeng; Wang, Yandan; Chen, Yiran; Li, Hai. Advances in Neural Information Processing Systems 29 (NIPS 2016), 2016, 29.
  • [2] Structured Dynamic Precision for Deep Neural Networks Quantization. Huang, Kai; Li, Bowen; Xiong, Dongliang; Jiang, Haitian; Jiang, Xiaowen; Yan, Xiaolang; Claesen, Luc; Liu, Dehong; Chen, Junjian; Liu, Zhili. ACM Transactions on Design Automation of Electronic Systems, 2023, 28(01).
  • [3] Variance-Guided Structured Sparsity in Deep Neural Networks. Pandit, M.K.; Banday, M. IEEE Transactions on Artificial Intelligence, 2023, 4(06): 1714-1723.
  • [4] Feature flow regularization: Improving structured sparsity in deep neural networks. Wu, Yue; Lan, Yuan; Zhang, Luchan; Xiang, Yang. Neural Networks, 2023, 161: 598-613.
  • [5] Dataflow-based Joint Quantization for Deep Neural Networks. Geng, Xue; Fu, Jie; Zhao, Bin; Lin, Jie; Aly, Mohamed M. Sabry; Pal, Christopher; Chandrasekhar, Vijay. 2019 Data Compression Conference (DCC), 2019: 574.
  • [6] Sparse ReRAM Engine: Joint Exploration of Activation and Weight Sparsity in Compressed Neural Networks. Yang, Tzu-Hsien; Cheng, Hsiang-Yun; Yang, Chia-Lin; Tseng, I-Ching; Hu, Han-Wen; Chang, Hung-Sheng; Li, Hsiang-Pang. Proceedings of the 2019 46th International Symposium on Computer Architecture (ISCA '19), 2019: 236-249.
  • [7] Accelerating Deep Neural Networks via Semi-Structured Activation Sparsity. Grimaldi, Matteo; Ganji, Darshan C.; Lazarevich, Ivan; Sah, Sudhakar. 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), 2023: 1171-1180.
  • [8] Structured Pruning for Deep Convolutional Neural Networks via Adaptive Sparsity Regularization. Shao, Tuanjie; Shin, Dongkun. 2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC 2022), 2022: 982-987.
  • [9] Addressing Sparsity in Deep Neural Networks. Zhou, Xuda; Du, Zidong; Zhang, Shijin; Zhang, Lei; Lan, Huiying; Liu, Shaoli; Li, Ling; Guo, Qi; Chen, Tianshi; Chen, Yunji. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2019, 38(10): 1858-1871.
  • [10] Compressed sensing with structured sparsity and structured acquisition. Boyer, Claire; Bigot, Jeremie; Weiss, Pierre. Applied and Computational Harmonic Analysis, 2019, 46(02): 312-350.