Effective Zero Compression on ReRAM-based Sparse DNN Accelerators

Cited by: 5
Authors
Shin, Hoon [1]
Park, Rihae [1]
Lee, Seung Yul [1]
Park, Yeonhong [1]
Lee, Hyunseung [1]
Lee, Jae W. [1]
Affiliations
[1] Seoul Natl Univ, Seoul, South Korea
DOI
10.1145/3489517.3530564
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
For efficient DNN inference, Resistive RAM (ReRAM) crossbars have emerged as a promising building block to compute matrix multiplication in an area- and power-efficient manner. To improve inference throughput, sparse models can be deployed on the ReRAM-based DNN accelerator. While unstructured pruning maintains both high accuracy and high sparsity, it performs poorly on the crossbar architecture due to the irregular locations of pruned weights. Meanwhile, due to the non-ideality of ReRAM cells and the high cost of ADCs, matrix multiplication is usually performed at a fine granularity, called an Operation Unit (OU), along both wordline and bitline dimensions. While fine-grained, OU-based row compression (ORC) has recently been proposed to increase the weight compression ratio, significant performance potential is still left on the table due to sub-optimal weight mappings. Thus, we propose a novel weight mapping scheme that effectively clusters zero weights via OU-level filter reordering, hence improving the effective weight compression ratio. We also introduce a weight recovery scheme to further improve accuracy or compression ratio, or both. Our evaluation with three popular DNNs demonstrates that the proposed scheme effectively eliminates redundant weights in the crossbar array, and hence ineffectual computation, to achieve an array compression ratio of 3.27-4.26x with negligible accuracy loss over the baseline ReRAM-based DNN accelerator.
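
The interaction between OU-based row compression and filter reordering described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes one simplified reading of ORC: the weight matrix is split into column groups one OU wide, and within each group any row segment that is entirely zero is dropped. The function name `orc_compression_ratio`, the parameter `ou_cols`, the matrix `W`, and the global column permutation are all hypothetical choices made for illustration. The paper reorders filters at OU level, which this sketch does not reproduce.

```python
import numpy as np

def orc_compression_ratio(weights, ou_cols):
    """Ratio of total row segments to row segments that survive compression.

    Assumption: a row segment (one row within an OU-wide column group)
    can be skipped only if every weight in it is zero.
    """
    rows, cols = weights.shape
    total = 0
    kept = 0
    for start in range(0, cols, ou_cols):
        block = weights[:, start:start + ou_cols]
        nonzero_rows = np.any(block != 0, axis=1)  # True where the segment must be stored
        total += rows
        kept += int(nonzero_rows.sum())
    return total / max(kept, 1)

# Toy pruned weight matrix: rows follow wordlines, columns are filters.
W = np.array([
    [1, 0, 2, 0],
    [0, 3, 0, 4],
    [5, 0, 6, 0],
    [0, 7, 0, 8],
])

# Original filter order: every row segment holds a nonzero weight.
before = orc_compression_ratio(W, ou_cols=2)

# Hypothetical reordering that places filters with matching zero rows together.
after = orc_compression_ratio(W[:, [0, 2, 1, 3]], ou_cols=2)

print(before, after)
```

In this toy case the sparsity of `W` is unchanged by the permutation, yet grouping filters whose zeros fall in the same rows turns scattered zeros into droppable all-zero segments, which is the effect the abstract attributes to zero clustering.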
Pages: 949-954
Number of pages: 6