JOINT OPTIMIZATION OF QUANTIZATION AND STRUCTURED SPARSITY FOR COMPRESSED DEEP NEURAL NETWORKS

Cited by: 0
Authors:
Srivastava, Gaurav [1 ]
Kadetotad, Deepak [1 ]
Yin, Shihui [1 ]
Berisha, Visar [1 ]
Chakrabarti, Chaitali [1 ]
Seo, Jae-sun [1 ]
Affiliations:
[1] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85281 USA
Keywords:
DOI: Not available
CLC Classification:
O42 [Acoustics];
Subject Classification Codes:
070206; 082403
Abstract
The use of Deep Neural Networks (DNNs) on resource-constrained edge devices has been limited by their high computation and large memory requirements. In this work, we propose an algorithm to compress DNNs by jointly optimizing structured sparsity and quantization constraints in a single DNN training framework. The proposed algorithm has been extensively validated on high/low-capacity DNNs and wide/deep sparse DNNs. Further, we perform Pareto-optimal analysis to extract optimal DNN models from a large set of trained DNN models. The optimal structurally-compressed DNN model achieves ~50X weight memory reduction without test accuracy degradation, compared to the uncompressed floating-point DNN.
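
The abstract describes imposing structured sparsity and quantization constraints on the same weights during training. The following is a minimal sketch of that general idea, not the authors' released code: PyTorch is assumed, and the helper names (block_prune_mask, quantize_ste) and hyperparameters (block size, sparsity, bit width) are illustrative choices of our own.

# A minimal sketch (not the authors' released code) of jointly applying
# block-structured sparsity and weight quantization in one training step.
# PyTorch is assumed; helper names and hyperparameters are illustrative.
import torch
import torch.nn as nn

def block_prune_mask(weight, block, sparsity):
    # Zero out whole (block x block) sub-blocks with the smallest L2 norm.
    rows, cols = weight.shape
    w = weight.reshape(rows // block, block, cols // block, block)
    scores = w.pow(2).sum(dim=(1, 3)).sqrt()          # one score per block
    k = int(sparsity * scores.numel())                # number of blocks to drop
    thresh = scores.flatten().kthvalue(k).values if k > 0 else -1.0
    keep = (scores > thresh).float()                  # 1 = keep, 0 = prune
    return keep[:, None, :, None].expand_as(w).reshape(rows, cols)

def quantize_ste(weight, n_bits):
    # Uniform symmetric quantization with a straight-through estimator:
    # the forward pass uses the quantized value, the backward pass lets
    # gradients flow to the full-precision weight unchanged.
    qmax = 2 ** (n_bits - 1) - 1
    scale = weight.abs().max().clamp(min=1e-8) / qmax
    q = (weight / scale).round().clamp(-qmax, qmax) * scale
    return weight + (q - weight).detach()

# One (dummy) training step: mask then quantize the weight before the
# forward pass, so both constraints shape the same set of learned weights.
layer = nn.Linear(64, 64, bias=False)
mask = block_prune_mask(layer.weight.data, block=8, sparsity=0.75)
x = torch.randn(16, 64)
w_eff = quantize_ste(layer.weight * mask, n_bits=4)
loss = torch.nn.functional.linear(x, w_eff).pow(2).mean()
loss.backward()  # gradients reach layer.weight through the STE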
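
The Pareto-optimal extraction step mentioned in the abstract can likewise be sketched: from a set of trained models scored on two competing axes, keep only those that no other model beats on both. The candidate (weight memory, accuracy) values below are made-up illustrations, not results from the paper.

# A minimal sketch of Pareto-optimal model extraction: from candidate
# (weight memory in KB, test accuracy in %) pairs, keep only models that
# no other model beats on both axes. Sample values are illustrative only.
def pareto_front(models):
    front = []
    # Sort by ascending memory; break memory ties by higher accuracy first.
    for mem, acc in sorted(models, key=lambda p: (p[0], -p[1])):
        # Keep a point only if it beats the best accuracy achievable at
        # any smaller (or equal) memory footprint seen so far.
        if not front or acc > front[-1][1]:
            front.append((mem, acc))
    return front

candidates = [(120.0, 97.1), (60.0, 96.8), (60.0, 95.9), (45.0, 96.9), (30.0, 94.0)]
print(pareto_front(candidates))   # -> [(30.0, 94.0), (45.0, 96.9), (120.0, 97.1)]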
Pages: 1393-1397
Page count: 5