A Memory-Efficient Edge Inference Accelerator with XOR-based Model Compression

Cited by: 0
Authors
Lee, Hyunseung [1]
Hong, Jihoon [1]
Kim, Soosung [1]
Lee, Seung Yul [1]
Lee, Jae W. [1]
Affiliations
[1] Seoul Natl Univ, Seoul 08826, South Korea
DOI
10.1109/DAC56929.2023.10248005
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Model compression is widely adopted for edge inference of neural networks (NNs) to minimize both costly DRAM accesses and memory footprint. Recently, XOR-based model compression has demonstrated promising results in maximizing compression ratio while minimizing accuracy drop. However, XOR-based decompression alone produces bit errors and requires auxiliary data for error correction. To minimize model size and hence DRAM traffic, we propose an enhanced decompression algorithm and a low-cost hardware accelerator for it. Because not all errors are equally important, our algorithm corrects only the important ones, with no accuracy drop. Compared with the baseline XOR compression scheme, which corrects all errors, the compressed model sizes of ResNet-18 and VGG-16 are reduced by 23% and 27%, respectively. We also present a low-cost hardware implementation of on-line XOR decompression and error-correction logic built on Gemmini, an open-source systolic array accelerator, at a cost of only 0.39% and 0.46% increases in area and power, respectively.
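The idea summarized in the abstract (XOR-based expansion of compressed bits, followed by correction of only a chosen subset of the resulting bit errors) can be illustrated with a toy Python sketch. This record does not specify the paper's XOR network, its patch format, or its criterion for an "important" error, so everything below is an assumption made for illustration: the random binary matrix, the brute-force seed search, and the rule that the first bit of each 4-bit group is "important" are all hypothetical.

```python
import random

def xor_decode(seed_bits, matrix):
    """Expand seed bits: each output bit is the XOR (parity) of the
    seed bits selected by one row of a fixed binary matrix."""
    return [sum(s & m for s, m in zip(seed_bits, row)) % 2 for row in matrix]

def find_errors(decoded, target):
    """Positions where the decoded bits differ from the original bits."""
    return [i for i, (d, t) in enumerate(zip(decoded, target)) if d != t]

def select_patches(errors, important):
    """Keep only errors at positions deemed important (hypothetical rule)."""
    return [i for i in errors if i in important]

def apply_patches(decoded, patches):
    """Flip the bits at the patched positions."""
    out = list(decoded)
    for i in patches:
        out[i] ^= 1
    return out

# Toy setup: 4 compressed bits expand to 8 output bits.
rng = random.Random(0)
n_in, n_out = 4, 8
matrix = [[rng.randint(0, 1) for _ in range(n_in)] for _ in range(n_out)]
target = [rng.randint(0, 1) for _ in range(n_out)]

# Encoder side (assumed brute force): pick the seed with the fewest errors.
seeds = [[(k >> j) & 1 for j in range(n_in)] for k in range(2 ** n_in)]
seed = min(seeds, key=lambda s: len(find_errors(xor_decode(s, matrix), target)))

decoded = xor_decode(seed, matrix)
errors = find_errors(decoded, target)

# Assumption: the first bit of each 4-bit group matters most.
important = {0, 4}
patches = select_patches(errors, important)
restored = apply_patches(decoded, patches)

print("errors:", errors, "patches stored:", patches)
```

Storing `patches` instead of `errors` is what shrinks the auxiliary data in this sketch; bits at unimportant positions may remain wrong, which is tolerable only if the importance rule is chosen so that accuracy is unaffected.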
Pages: 6