Structured Sparse Ternary Weight Coding of Deep Neural Networks for Efficient Hardware Implementations

Cited by: 0
Authors
Boo, Yoonho [1 ]
Sung, Wonyong [1 ]
Affiliations
[1] Seoul Natl Univ, Dept Elect Engn & Comp Sci, Seoul 151744, South Korea
Funding
National Research Foundation of Singapore
Keywords
Deep neural networks; weight storage compression; structured sparsity; fixed-point quantization; network pruning
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep neural networks (DNNs) usually demand a large number of operations for real-time inference. In particular, fully-connected layers contain many weights and therefore require frequent off-chip memory accesses during inference. We propose a weight compression method for deep neural networks that permits weight values of +1 or -1 only at predetermined positions, so that decoding can be performed easily with a lookup table. For example, structured sparse (8, 2) coding allows at most two non-zero values among each group of eight weights. This method not only enables multiplication-free DNN implementations but also compresses the weight storage by up to 32x compared with floating-point networks. Weight distribution normalization and gradual pruning techniques are applied to mitigate the resulting performance degradation. Experiments are conducted on fully-connected deep neural networks and convolutional neural networks.
Pages: 6
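
The abstract states the coding constraint but not a procedure. Below is a minimal Python sketch of the idea, with two stated assumptions: the surviving weights of each group are chosen by magnitude and quantized to +1/-1 (the paper instead trains the network, with weight distribution normalization and gradual pruning, to satisfy the constraint), and the lookup-table cost is estimated by simply counting the legal patterns per group. The names sst_encode and codebook_bits are hypothetical, not from the paper.

    import numpy as np
    from math import comb, ceil, log2

    def sst_encode(weights, group_size=8, max_nonzeros=2):
        """Structured sparse ternary (group_size, max_nonzeros) coding sketch.

        Keeps at most `max_nonzeros` entries per group of `group_size`
        weights, quantized to +1/-1; all other entries become zero.
        Magnitude-based selection is an assumption made for illustration.
        """
        w = np.asarray(weights, dtype=np.float64)
        assert w.size % group_size == 0, "weight count must divide into groups"
        groups = w.reshape(-1, group_size)
        coded = np.zeros_like(groups)
        for g, row in enumerate(groups):
            keep = np.argsort(np.abs(row))[-max_nonzeros:]  # largest-magnitude positions
            coded[g, keep] = np.sign(row[keep])             # ternary values: -1, 0, +1
        return coded.reshape(w.shape)

    def codebook_bits(group_size=8, max_nonzeros=2):
        """Bits needed to index every legal ternary pattern of one group."""
        patterns = sum(comb(group_size, j) * 2**j for j in range(max_nonzeros + 1))
        return ceil(log2(patterns))

    rng = np.random.default_rng(0)
    print(sst_encode(rng.standard_normal(16)))  # at most 2 non-zeros per 8 weights
    print(codebook_bits())                      # (8, 2): 1 + 16 + 112 = 129 patterns -> 8 bits

For (8, 2) coding there are 1 + 16 + 112 = 129 legal patterns, so each group of eight weights can be stored as an 8-bit table index, i.e. one bit per weight versus 32 bits per float32 weight, which is consistent with the up-to-32x compression figure in the abstract.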