Structured Sparse Ternary Weight Coding of Deep Neural Networks for Efficient Hardware Implementations

Cited by: 0
Authors
Boo, Yoonho [1 ]
Sung, Wonyong [1 ]
Affiliations
[1] Seoul Natl Univ, Dept Elect Engn & Comp Sci, Seoul 151744, South Korea
Funding
National Research Foundation of Singapore
Keywords
Deep neural networks; weight storage compression; structured sparsity; fixed-point quantization; network pruning;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Deep neural networks (DNNs) usually demand a large number of operations for real-time inference. In particular, fully-connected layers contain many weights and therefore require many off-chip memory accesses during inference. We propose a weight compression method for deep neural networks that allows values of +1 or -1 only at predetermined positions of the weights, so that decoding can be conducted easily with a table. For example, the structured sparse (8,2) coding allows at most two non-zero values among eight weights. This method not only enables multiplication-free DNN implementations but also compresses the weight storage by up to 32x compared to floating-point networks. Weight distribution normalization and gradual pruning techniques are applied to mitigate the performance degradation. Experiments are conducted with fully-connected deep neural networks and convolutional neural networks.
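The (8,2) coding idea can be illustrated with a minimal sketch. The paper does not give its exact encoding here, so the construction below is an assumption: enumerate every length-8 ternary pattern with at most two non-zero entries in {-1, +1}, index the resulting codebook, and decode by table lookup. There are 1 + 8*2 + 28*4 = 129 such patterns, so a single 8-bit code covers a group of eight weights, i.e. 1 bit per weight versus 32 bits for float32 (the 32x figure in the abstract).

```python
from itertools import combinations, product

GROUP = 8   # weights per group, as in the (8,2) example
MAX_NZ = 2  # at most two non-zero values per group

def build_codebook():
    """Enumerate all length-8 ternary patterns with at most 2 non-zeros."""
    book = []
    for k in range(MAX_NZ + 1):                       # number of non-zeros
        for positions in combinations(range(GROUP), k):
            for signs in product((-1, 1), repeat=k):  # each non-zero is +/-1
                w = [0] * GROUP
                for p, s in zip(positions, signs):
                    w[p] = s
                book.append(tuple(w))
    return book

CODEBOOK = build_codebook()                # 129 patterns -> fits in 8 bits
INDEX = {w: i for i, w in enumerate(CODEBOOK)}

def encode(group):
    """Map a valid 8-weight ternary group to its 8-bit codebook index."""
    return INDEX[tuple(group)]

def decode(code):
    """Table lookup, mirroring the easy table-based decoding in the paper."""
    return CODEBOOK[code]
```

With this packing, an inference engine never multiplies: each decoded group contributes at most two additions or subtractions of activations, which is the multiplication-free property the abstract refers to.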
Pages: 6