Structured Sparse Ternary Weight Coding of Deep Neural Networks for Efficient Hardware Implementations

Cited by: 0
Authors
Boo, Yoonho [1 ]
Sung, Wonyong [1 ]
Affiliations
[1] Seoul Natl Univ, Dept Elect Engn & Comp Sci, Seoul 151744, South Korea
Funding
National Research Foundation of Singapore
Keywords
Deep neural networks; weight storage compression; structured sparsity; fixed-point quantization; network pruning;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Deep neural networks (DNNs) usually demand a large number of operations for real-time inference. In particular, fully-connected layers contain many weights and therefore require many off-chip memory accesses during inference. We propose a weight compression method for deep neural networks that allows values of +1 or -1 only at predetermined positions of the weights, so that decoding can be conducted easily with a table. For example, the structured sparse (8,2) coding allows at most two non-zero values among eight weights. This method not only enables multiplication-free DNN implementations but also compresses the weight storage by up to 32x compared to floating-point networks. Weight distribution normalization and gradual pruning techniques are applied to mitigate the performance degradation. Experiments are conducted with fully-connected deep neural networks and convolutional neural networks.
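The (8,2) coding idea can be illustrated with a minimal sketch. The paper does not give its exact encoding here, so the construction below is an assumption: enumerate every length-8 ternary pattern with at most two non-zero entries in {-1, +1}, index the resulting codebook, and decode by table lookup. There are 1 + 8*2 + 28*4 = 129 such patterns, so a single 8-bit code covers a group of eight weights, i.e. 1 bit per weight versus 32 bits for float32 (the 32x figure in the abstract).

```python
from itertools import combinations, product

GROUP = 8   # weights per group, as in the (8,2) example
MAX_NZ = 2  # at most two non-zero values per group

def build_codebook():
    """Enumerate all length-8 ternary patterns with at most 2 non-zeros."""
    book = []
    for k in range(MAX_NZ + 1):                       # number of non-zeros
        for positions in combinations(range(GROUP), k):
            for signs in product((-1, 1), repeat=k):  # each non-zero is +/-1
                w = [0] * GROUP
                for p, s in zip(positions, signs):
                    w[p] = s
                book.append(tuple(w))
    return book

CODEBOOK = build_codebook()                # 129 patterns -> fits in 8 bits
INDEX = {w: i for i, w in enumerate(CODEBOOK)}

def encode(group):
    """Map a valid 8-weight ternary group to its 8-bit codebook index."""
    return INDEX[tuple(group)]

def decode(code):
    """Table lookup, mirroring the easy table-based decoding in the paper."""
    return CODEBOOK[code]
```

With this packing, an inference engine never multiplies: each decoded group contributes at most two additions or subtractions of activations, which is the multiplication-free property the abstract refers to.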
Pages: 6