Structured Sparse Ternary Weight Coding of Deep Neural Networks for Efficient Hardware Implementations

Cited by: 0
Authors
Boo, Yoonho [1 ]
Sung, Wonyong [1 ]
Affiliations
[1] Seoul Natl Univ, Dept Elect Engn & Comp Sci, Seoul 151744, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Deep neural networks; weight storage compression; structured sparsity; fixed-point quantization; network pruning;
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep neural networks (DNNs) usually demand a large number of operations for real-time inference. In particular, fully-connected layers contain many weights and therefore require frequent off-chip memory accesses during inference. We propose a weight compression method for deep neural networks that allows values of +1 or -1 only at predetermined positions of the weights, so that decoding can be performed easily with a lookup table. For example, structured sparse (8,2) coding allows at most two non-zero values among eight weights. This method not only enables multiplication-free DNN implementations but also compresses the weight storage by up to 32× compared with floating-point networks. Weight distribution normalization and gradual pruning techniques are applied to mitigate the performance degradation. Experiments are conducted with fully-connected deep neural networks and convolutional neural networks.
Pages: 6
Related Papers (50 total)
  • [1] Compression of Deep Neural Networks with Structured Sparse Ternary Coding
    Boo, Yoonho
    Sung, Wonyong
    [J]. JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2019, 91 (09): 1009 - 1019
  • [3] Compressing Sparse Ternary Weight Convolutional Neural Networks for Efficient Hardware Acceleration
    Wi, Hyeonwook
    Kim, Hyeonuk
    Choi, Seungkyu
    Kim, Lee-Sup
    [J]. 2019 IEEE/ACM INTERNATIONAL SYMPOSIUM ON LOW POWER ELECTRONICS AND DESIGN (ISLPED), 2019,
  • [4] Quantized Guided Pruning for Efficient Hardware Implementations of Deep Neural Networks
    Hacene, Ghouthi Boukli
    Gripon, Vincent
    Arzel, Matthieu
    Farrugia, Nicolas
    Bengio, Yoshua
    [J]. 2020 18TH IEEE INTERNATIONAL NEW CIRCUITS AND SYSTEMS CONFERENCE (NEWCAS'20), 2020, : 206 - 209
  • [5] An Efficient Hardware Accelerator for Structured Sparse Convolutional Neural Networks on FPGAs
    Zhu, Chaoyang
    Huang, Kejie
    Yang, Shuyuan
    Zhu, Ziqi
    Zhang, Hejia
    Shen, Haibin
    [J]. IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2020, 28 (09) : 1953 - 1965
  • [6] An Efficient Hardware Accelerator for Sparse Transformer Neural Networks
    Fang, Chao
    Guo, Shouliang
    Wu, Wei
    Lin, Jun
    Wang, Zhongfeng
    Hsu, Ming Kai
    Liu, Lingzhi
    [J]. 2022 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS 22), 2022, : 2670 - 2674
  • [7] Deep Neural Network Structured Sparse Coding for Online Processing
    Zhao, Haoli
    Ding, Shuxue
    Li, Xiang
    Huang, Huakun
    [J]. IEEE ACCESS, 2018, 6 : 74778 - 74791
  • [8] Structured Weight Matrices-Based Hardware Accelerators in Deep Neural Networks: FPGAs and ASICs
    Ding, Caiwen
    Ren, Ao
    Yuan, Geng
    Ma, Xiaolong
    Li, Jiayu
    Liu, Ning
    Yuan, Bo
    Wang, Yanzhi
    [J]. PROCEEDINGS OF THE 2018 GREAT LAKES SYMPOSIUM ON VLSI (GLSVLSI'18), 2018, : 353 - 358
  • [9] ELECTRONIC HARDWARE IMPLEMENTATIONS OF NEURAL NETWORKS
    Thakoor, A. P.
    Moopenn, A.
    Lambe, J.
    Khanna, S. K.
    [J]. APPLIED OPTICS, 1987, 26 (23): 5085 - 5092
  • [10] Efficient Hardware Accelerator for Compressed Sparse Deep Neural Network
    Xiao, Hao
    Zhao, Kaikai
    Liu, Guangzhu
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2021, E104D (05): 772 - 775