Effective Zero Compression on ReRAM-based Sparse DNN Accelerators

Cited by: 5

Authors:

Shin, Hoon [1]

Park, Rihae [1]

Lee, Seung Yul [1]

Park, Yeonhong [1]

Lee, Hyunseung [1]

Lee, Jae W. [1]

Affiliations:

[1] Seoul Natl Univ, Seoul, South Korea

Source:

PROCEEDINGS OF THE 59TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC 2022 | 2022

Keywords:

DOI:

10.1145/3489517.3530564

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

For efficient DNN inference, Resistive RAM (ReRAM) crossbars have emerged as a promising building block to compute matrix multiplication in an area- and power-efficient manner. To improve inference throughput, sparse models can be deployed on the ReRAM-based DNN accelerator. While unstructured pruning maintains both high accuracy and high sparsity, it performs poorly on the crossbar architecture due to the irregular locations of pruned weights. Meanwhile, due to the non-ideality of ReRAM cells and the high cost of ADCs, matrix multiplication is usually performed at a fine granularity, called an Operation Unit (OU), along both the wordline and bitline dimensions. While fine-grained, OU-based row compression (ORC) has recently been proposed to increase the weight compression ratio, significant performance potential is still left on the table due to sub-optimal weight mappings. Thus, we propose a novel weight mapping scheme that effectively clusters zero weights via OU-level filter reordering, hence improving the effective weight compression ratio. We also introduce a weight recovery scheme to further improve accuracy or compression ratio, or both. Our evaluation with three popular DNNs demonstrates that the proposed scheme effectively eliminates redundant weights in the crossbar array, and hence ineffectual computation, to achieve an array compression ratio of 3.27-4.26x with negligible accuracy loss over the baseline ReRAM-based DNN accelerator.
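The interaction between OU-based row compression and filter ordering described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the function names, the toy 4x4 weight matrix, the 2x2 OU size, and the "sort columns by zero pattern" heuristic are all assumptions made for illustration. The sketch treats matrix columns as filters (bitlines) and rows as inputs (wordlines), drops rows that are entirely zero within each group of OU columns, and counts how many OUs remain.

```python
from math import ceil

def dense_ou_count(weights, ou_rows, ou_cols):
    """OUs needed when every weight, zero or not, is mapped to the crossbar."""
    return ceil(len(weights) / ou_rows) * ceil(len(weights[0]) / ou_cols)

def orc_ou_count(weights, ou_rows, ou_cols):
    """OUs needed when rows that are all-zero inside a column group are dropped.

    Columns are split into groups of `ou_cols`; within each group only rows
    holding at least one nonzero weight are kept and packed into OUs.
    """
    n_cols = len(weights[0])
    total = 0
    for start in range(0, n_cols, ou_cols):
        group = [row[start:start + ou_cols] for row in weights]
        kept_rows = sum(1 for row in group if any(w != 0 for w in row))
        total += ceil(kept_rows / ou_rows)
    return total

def reorder_columns(weights, order):
    """Return a copy of `weights` with columns permuted according to `order`."""
    return [[row[c] for c in order] for row in weights]

def sort_by_zero_pattern(weights):
    """Hypothetical heuristic: place columns with identical zero patterns
    next to each other. A stand-in for filter reordering, not the paper's method."""
    def pattern(c):
        return tuple(row[c] != 0 for row in weights)
    return sorted(range(len(weights[0])), key=pattern)

# Toy weights (assumed): columns 0 and 2 share one zero pattern, 1 and 3 another.
W = [
    [1, 0, 2, 0],
    [3, 0, 4, 0],
    [0, 5, 0, 6],
    [0, 7, 0, 8],
]

order = sort_by_zero_pattern(W)
before = orc_ou_count(W, ou_rows=2, ou_cols=2)
after = orc_ou_count(reorder_columns(W, order), ou_rows=2, ou_cols=2)
print(dense_ou_count(W, 2, 2), before, after)
```

In this toy case the original column order interleaves the two zero patterns, so no row is all-zero within a column group and row compression removes nothing. Grouping like-patterned columns lets whole rows drop out of each group, which is the kind of zero clustering the abstract attributes to filter reordering.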

Pages: 949-954

Number of pages: 6

50 records in total

[21] Mathematical Framework for Optimizing Crossbar Allocation for ReRAM-based CNN Accelerators
Li, Wanqian
Han, Yinhe
Chen, Xiaoming
ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS, 2024, 29 (01)
[22] Trained Biased Number Representation for ReRAM-Based Neural Network Accelerators
Wang, Weijia
Lin, Bill
ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS, 2019, 15 (02)
[23] PattPIM: A Practical ReRAM-Based DNN Accelerator by Reusing Weight Pattern Repetitions
Zhang, Yuhao
Jia, Zhiping
Pan, Yungang
Du, Hongchao
Shen, Zhaoyan
Zhao, Mengying
Shao, Zili
PROCEEDINGS OF THE 2020 57TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2020
[24] Improving Reliability of ReRAM-Based DNN Implementation through Novel Weight Distribution
Li, Jingtao
Mao, Manqing
Chakrabarti, Chaitali
PROCEEDINGS OF THE 2019 IEEE INTERNATIONAL WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS 2019), 2019, : 189 - 194
[25] A Quantized Training Framework for Robust and Accurate ReRAM-based Neural Network Accelerators
Zhang, Chenguang
Zhou, Pingqiang
2021 26TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC), 2021, : 43 - 48
[26] REC: REtime Convolutional layers in energy harvesting ReRAM-based CNN accelerators
Zhou, Kunyu
Qiu, Keni
PROCEEDINGS OF THE 19TH ACM INTERNATIONAL CONFERENCE ON COMPUTING FRONTIERS 2022 (CF 2022), 2022, : 185 - 188
[27] FARe: Fault-Aware GNN Training on ReRAM-based PIM Accelerators
Dhingra, Pratyush
Ogbogu, Chukwufumnanya
Joardar, Biresh Kumar
Doppa, Janardhan Rao
Kalyanaraman, Ananth
Pande, Partha Pratim
2024 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION, DATE, 2024
[28] A Practical Highly Paralleled ReRAM-Based DNN Accelerator by Reusing Weight Pattern Repetitions
Zhang, Yuhao
Jia, Zhiping
Du, Hongchao
Xue, Runzhen
Shen, Zhaoyan
Shao, Zili
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2022, 41 (04) : 922 - 935
[29] Quarry: Quantization-based ADC Reduction for ReRAM-based Deep Neural Network Accelerators
Azamat, Azat
Asim, Faaiz
Lee, Jongeun
2021 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER AIDED DESIGN (ICCAD), 2021
[30] On Minimizing Analog Variation Errors to Resolve the Scalability Issue of ReRAM-Based Crossbar Accelerators
Kang, Yao-Wen
Wu, Chun-Feng
Chang, Yuan-Hao
Kuo, Tei-Wei
Ho, Shu-Yin
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2020, 39 (11) : 3856 - 3867