JOINT OPTIMIZATION OF QUANTIZATION AND STRUCTURED SPARSITY FOR COMPRESSED DEEP NEURAL NETWORKS

Cited: 0
Authors
Srivastava, Gaurav [1 ]
Kadetotad, Deepak [1 ]
Yin, Shihui [1 ]
Berisha, Visar [1 ]
Chakrabarti, Chaitali [1 ]
Seo, Jae-sun [1 ]
Affiliations
[1] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85281 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC)
O42 [Acoustics];
Subject classification codes
070206; 082403
Abstract
The use of Deep Neural Networks (DNNs) on resource-constrained edge devices has been limited by their high computation and large memory requirements. In this work, we propose an algorithm to compress DNNs by jointly optimizing structured sparsity and quantization constraints in a single DNN training framework. The proposed algorithm has been extensively validated on high/low-capacity DNNs and wide/deep sparse DNNs. Further, we perform a Pareto-optimal analysis to extract optimal DNN models from a large set of trained DNN models. The optimal structurally compressed DNN model achieves ~50X weight memory reduction without test accuracy degradation, compared to the uncompressed floating-point DNN.
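The abstract does not specify the exact sparsity pattern or quantization scheme, so the sketch below is only a minimal illustration, not the authors' method, of what jointly imposing structured sparsity and quantization during training can look like. It assumes block-wise magnitude pruning of linear-layer weights and uniform symmetric fake quantization trained with a straight-through estimator; the block size, keep ratio, bit-width, and all names (FakeQuantize, block_mask, SparseQuantLinear) are illustrative assumptions.

# Minimal sketch (assumed scheme, not the paper's released code): train a linear
# layer whose weights are simultaneously block-sparse and quantized to few bits.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FakeQuantize(torch.autograd.Function):
    """Uniform symmetric weight quantization with a straight-through gradient."""

    @staticmethod
    def forward(ctx, w, num_bits):
        qmax = 2 ** (num_bits - 1) - 1
        scale = w.abs().max() / qmax + 1e-12
        return torch.clamp(torch.round(w / scale), -qmax, qmax) * scale

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: pass the gradient through unchanged.
        return grad_output, None


def block_mask(weight, block_size=8, keep_ratio=0.1):
    """Keep only the weight blocks with the largest L2 norms (structured sparsity)."""
    out_f, in_f = weight.shape
    blocks = weight.reshape(out_f // block_size, block_size,
                            in_f // block_size, block_size)
    norms = blocks.pow(2).sum(dim=(1, 3)).sqrt()        # one norm per block
    k = max(1, int(keep_ratio * norms.numel()))
    thresh = norms.flatten().topk(k).values.min()
    mask = (norms >= thresh).float()[:, None, :, None]  # broadcast over block dims
    return (blocks * mask).reshape(out_f, in_f)


class SparseQuantLinear(nn.Module):
    def __init__(self, in_f, out_f, block_size=8, keep_ratio=0.1, num_bits=4):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_f, in_f) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_f))
        self.block_size, self.keep_ratio, self.num_bits = block_size, keep_ratio, num_bits

    def forward(self, x):
        w = block_mask(self.weight, self.block_size, self.keep_ratio)
        w = FakeQuantize.apply(w, self.num_bits)         # quantize the surviving blocks
        return F.linear(x, w, self.bias)


if __name__ == "__main__":
    layer = SparseQuantLinear(64, 32)
    opt = torch.optim.SGD(layer.parameters(), lr=0.1)
    x, y = torch.randn(128, 64), torch.randn(128, 32)
    for _ in range(5):
        loss = F.mse_loss(layer(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    print("final loss:", loss.item())

The point of the sketch is that both constraints are applied inside the forward pass, so the loss gradient only ever updates weights that survive the structured mask and the quantizer, which is one way to realize a "single training framework" for joint compression.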
Pages: 1393-1397
Number of pages: 5