Structured Sparse Ternary Weight Coding of Deep Neural Networks for Efficient Hardware Implementations

Cited by: 0
Authors
Boo, Yoonho [1]
Sung, Wonyong [1]
Affiliations
[1] Seoul Natl Univ, Dept Elect Engn & Comp Sci, Seoul 151744, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Deep neural networks; weight storage compression; structured sparsity; fixed-point quantization; network pruning;
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep neural networks (DNNs) usually demand a large number of operations for real-time inference. In particular, fully-connected layers contain a large number of weights and therefore typically require many off-chip memory accesses during inference. We propose a weight compression method for deep neural networks that allows weight values of +1 or -1 only at predetermined positions, so that decoding can be performed easily with a table. For example, the structured sparse (8,2) coding allows at most two non-zero values among eight weights. This method not only enables multiplication-free DNN implementations but also compresses the weight storage by up to 32x compared to floating-point networks. Weight distribution normalization and gradual pruning techniques are applied to mitigate the performance degradation. Experiments are conducted with fully-connected deep neural networks and convolutional neural networks.
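The abstract describes the structured sparse (8,2) coding concretely enough to sketch the per-group quantization step: within each group of eight weights, at most the two largest-magnitude entries are kept and mapped to +1 or -1, and the remaining entries are forced to zero. The Python/NumPy sketch below illustrates only this idea; the function name encode_group_8_2, the per-layer scale alpha, and the greedy top-2 selection are illustrative assumptions, not the authors' training or retraining procedure.

    # Minimal sketch of structured sparse (8,2) ternary coding as described in
    # the abstract. Names and the greedy top-2 selection are assumptions.
    import numpy as np

    def encode_group_8_2(weights, alpha):
        """Quantize a group of 8 weights: keep at most the two largest-magnitude
        entries as +1/-1 (scaled by alpha) and set the remaining six to zero."""
        w = np.asarray(weights, dtype=np.float64)
        assert w.shape == (8,)
        keep = np.argsort(np.abs(w))[-2:]              # positions of the two largest magnitudes
        code = np.zeros(8, dtype=np.int8)
        code[keep] = np.sign(w[keep]).astype(np.int8)  # ternary values in {-1, 0, +1}
        return code, alpha * code                      # stored ternary code, decoded weights

    # Example usage with a hypothetical per-layer scale factor alpha
    group = [0.03, -0.92, 0.10, 0.05, 0.77, -0.02, 0.01, 0.08]
    code, decoded = encode_group_8_2(group, alpha=0.8)
    print(code)  # -> [ 0 -1  0  0  1  0  0  0 ]

Under these assumptions, a table-based decoder only needs an index over the non-zero position patterns (C(8,0) + C(8,1) + C(8,2) = 37 patterns, i.e., 6 bits) plus two sign bits, roughly 8 bits per group of eight weights, which is consistent with the up-to-32x compression over 32-bit floating-point weights quoted in the abstract.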
Pages: 6
Related Papers
50 records in total
  • [41] Layerwise Sparse Coding for Pruned Deep Neural Networks with Extreme Compression Ratio
    Liu, Xiao
    Li, Wenbin
    Huo, Jing
    Yao, Lili
    Gao, Yang
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 4900 - 4907
  • [42] An Efficient and Fast Softmax Hardware Architecture (EFSHA) for Deep Neural Networks
    Hussain, Muhammad Awais
    Tsai, Tsung-Han
    2021 IEEE 3RD INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE CIRCUITS AND SYSTEMS (AICAS), 2021,
  • [43] Efficient Hardware Optimization Strategies for Deep Neural Networks Acceleration Chip
    Zhang Meng
    Zhang Jingwei
    Li Guoqing
    Wu Ruixia
    Zeng Xiaoyang
    JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2021, 43 (06) : 1510 - 1517
  • [44] Sparse coding of pathology slides compared to transfer learning with deep neural networks
    Fischer, Will
    Moudgalya, Sanketh S.
    Cohn, Judith D.
    Nguyen, Nga T. T.
    Kenyon, Garrett T.
    BMC BIOINFORMATICS, 2018, 19
  • [45] Efficient Hardware Acceleration for Approximate Inference of Bitwise Deep Neural Networks
    Vogel, Sebastian
    Guntoro, Andre
    Ascheid, Gerd
    2017 CONFERENCE ON DESIGN AND ARCHITECTURES FOR SIGNAL AND IMAGE PROCESSING (DASIP), 2017,
  • [46] DANoC: An Efficient Algorithm and Hardware Codesign of Deep Neural Networks on Chip
    Zhou, Xichuan
    Li, Shengli
    Tang, Fang
    Hu, Shengdong
    Lin, Zhi
    Zhang, Lei
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (07) : 3176 - 3187
  • [47] Novel architecture and synapse design for hardware implementations of neural networks
    Faculty of Engineering, Magee College, University of Ulster, Northland Road, Derry, BT48 7JL, United Kingdom
    COMPUTERS & ELECTRICAL ENGINEERING, 1998, 24 (1-2) : 75 - 87
  • [48] Novel architecture and synapse design for hardware implementations of neural networks
    McGinnity, TM
    Roche, B
    Maguire, LP
    McDaid, LJ
    COMPUTERS & ELECTRICAL ENGINEERING, 1998, 24 (1-2) : 75 - 87
  • [49] HARDWARE IMPLEMENTATIONS OF MLP ARTIFICIAL NEURAL NETWORKS WITH CONFIGURABLE TOPOLOGY
    Da Silva, Rodrigo Martins
    Nedjah, Nadia
    Mourelle, Luiza De Macedo
    JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS, 2011, 20 (03) : 417 - 437
  • [50] Interleaved Structured Sparse Convolutional Neural Networks
    Xie, Guotian
    Wang, Jingdong
    Zhang, Ting
    Lai, Jianhuang
    Hong, Richang
    Qi, Guo-Jun
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 8847 - 8856