Structured Sparse Ternary Weight Coding of Deep Neural Networks for Efficient Hardware Implementations

Cited by: 0
Authors
Boo, Yoonho [1 ]
Sung, Wonyong [1 ]
Affiliations
[1] Seoul Natl Univ, Dept Elect Engn & Comp Sci, Seoul 151744, South Korea
Funding
National Research Foundation of Singapore
Keywords
Deep neural networks; weight storage compression; structured sparsity; fixed-point quantization; network pruning
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep neural networks (DNNs) usually demand a large number of operations for real-time inference. In particular, fully-connected layers contain many weights and therefore require frequent off-chip memory accesses during inference. We propose a weight compression method for deep neural networks that permits weight values of +1 or -1 only at predetermined positions, so that decoding can be performed easily with a lookup table. For example, structured sparse (8, 2) coding allows at most two non-zero values among each group of eight weights. This method not only enables multiplication-free DNN implementations but also compresses the weight storage by up to 32x compared with floating-point networks. Weight distribution normalization and gradual pruning techniques are applied to mitigate the resulting performance degradation. Experiments are conducted on fully-connected deep neural networks and convolutional neural networks.
Pages: 6
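
The abstract states the coding constraint but not a procedure. Below is a minimal Python sketch of the idea, with two stated assumptions: the surviving weights of each group are chosen by magnitude and quantized to +1/-1 (the paper instead trains the network, with weight distribution normalization and gradual pruning, to satisfy the constraint), and the lookup-table cost is estimated by simply counting the legal patterns per group. The names sst_encode and codebook_bits are hypothetical, not from the paper.

    import numpy as np
    from math import comb, ceil, log2

    def sst_encode(weights, group_size=8, max_nonzeros=2):
        """Structured sparse ternary (group_size, max_nonzeros) coding sketch.

        Keeps at most `max_nonzeros` entries per group of `group_size`
        weights, quantized to +1/-1; all other entries become zero.
        Magnitude-based selection is an assumption made for illustration.
        """
        w = np.asarray(weights, dtype=np.float64)
        assert w.size % group_size == 0, "weight count must divide into groups"
        groups = w.reshape(-1, group_size)
        coded = np.zeros_like(groups)
        for g, row in enumerate(groups):
            keep = np.argsort(np.abs(row))[-max_nonzeros:]  # largest-magnitude positions
            coded[g, keep] = np.sign(row[keep])             # ternary values: -1, 0, +1
        return coded.reshape(w.shape)

    def codebook_bits(group_size=8, max_nonzeros=2):
        """Bits needed to index every legal ternary pattern of one group."""
        patterns = sum(comb(group_size, j) * 2**j for j in range(max_nonzeros + 1))
        return ceil(log2(patterns))

    rng = np.random.default_rng(0)
    print(sst_encode(rng.standard_normal(16)))  # at most 2 non-zeros per 8 weights
    print(codebook_bits())                      # (8, 2): 1 + 16 + 112 = 129 patterns -> 8 bits

For (8, 2) coding there are 1 + 16 + 112 = 129 legal patterns, so each group of eight weights can be stored as an 8-bit table index, i.e. one bit per weight versus 32 bits per float32 weight, which is consistent with the up-to-32x compression figure in the abstract.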