A Memory-Efficient Edge Inference Accelerator with XOR-based Model Compression

Authors:

Lee, Hyunseung [1]

Hong, Jihoon [1]

Kim, Soosung [1]

Lee, Seung Yul [1]

Lee, Jae W. [1]

Affiliations:

[1] Seoul Natl Univ, Seoul 08826, South Korea

Source:

2023 60TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC | 2023

DOI:

10.1109/DAC56929.2023.10248005

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Model compression is widely adopted for edge inference of neural networks (NNs) to minimize both costly DRAM accesses and memory footprints. Recently, XOR-based model compression has demonstrated promising results in maximizing the compression ratio while minimizing the accuracy drop. However, XOR-based decompression alone produces bit errors and requires auxiliary data for error correction. To minimize model size and hence DRAM traffic, we propose an enhanced decompression algorithm and a low-cost hardware accelerator for it. Since not all errors are equal, our algorithm selects only important errors to correct, with no accuracy drop. Compared with the baseline XOR compression scheme that corrects all errors, the compressed model sizes of ResNet-18 and VGG-16 are reduced by 23% and 27%, respectively. We also present a low-cost hardware implementation of on-line XOR decompression and error-correction logic built on Gemmini, an open-source systolic array accelerator, at the cost of only a 0.39% and 0.46% increase in area and power, respectively.
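
The idea in the abstract (XOR decompression yields bit errors, and only some of them are corrected) can be illustrated with a minimal sketch. This is not the paper's algorithm: the random XOR matrix, the per-bit `importance` scores, the `threshold`, and all function names are hypothetical, chosen only to show how correcting a subset of errors needs less auxiliary patch data than correcting all of them.

```python
import numpy as np

def xor_decode(seed_bits, xor_matrix):
    # Each output bit is the XOR (parity) of the seed bits picked by one matrix row.
    return (xor_matrix @ seed_bits) % 2

def select_patches(decoded, target, importance, threshold):
    # Positions where decoding disagrees with the original bits.
    errors = np.flatnonzero(decoded != target)
    # Assumption: keep only errors whose (hypothetical) importance score is high enough.
    return errors[importance[errors] >= threshold]

def apply_patches(decoded, patch_positions):
    # Correcting a bit error is a flip of the decoded bit at that position.
    out = decoded.copy()
    out[patch_positions] ^= 1
    return out

rng = np.random.default_rng(0)
xor_matrix = rng.integers(0, 2, size=(8, 4), dtype=np.uint8)  # hypothetical XOR network
seed_bits = rng.integers(0, 2, size=4, dtype=np.uint8)        # compressed representation
target = rng.integers(0, 2, size=8, dtype=np.uint8)           # original bits to recover
importance = np.array([7, 6, 5, 4, 3, 2, 1, 0])               # hypothetical: bit significance

decoded = xor_decode(seed_bits, xor_matrix)
all_patches = np.flatnonzero(decoded != target)   # baseline: correct every error
kept = select_patches(decoded, target, importance, threshold=4)
corrected = apply_patches(decoded, kept)

print(f"errors: {len(all_patches)}, patches stored: {len(kept)}")
```

In this sketch, bits with importance below the threshold may remain wrong after decoding; the abstract's claim is that a suitable choice of which errors to skip leaves model accuracy unchanged, which this toy example does not attempt to demonstrate.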

Pages: 6
