FORMS: Fine-grained Polarized ReRAM-based In-situ Computation for Mixed-signal DNN Accelerator
Cited by: 35
|
Authors:
Yuan, Geng [1]; Behnam, Payman [2]; Li, Zhengang [1]; Shafiee, Ali [3]; Lin, Sheng [1]; Ma, Xiaolong [1]; Liu, Hang [4]; Qian, Xuehai [5]; Bojnordi, Mahdi Nazm [6]; Wang, Yanzhi [1]; Ding, Caiwen [7]
Affiliations:
[1] Northeastern Univ, Boston, MA 02115 USA
[2] Georgia Inst Technol, Atlanta, GA 30332 USA
[3] Samsung, Seoul, South Korea
[4] Stevens Inst Technol, Hoboken, NJ 07030 USA
[5] Univ Southern Calif, Los Angeles, CA 90089 USA
[6] Univ Utah, Salt Lake City, UT 84112 USA
[7] Univ Connecticut, Storrs, CT USA
Source:
2021 ACM/IEEE 48TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA 2021)
|
2021
Funding:
US National Science Foundation;
Keywords:
DOI:
10.1109/ISCA52012.2021.00029
CLC Number:
TP3 [Computing technology and computer technology];
Discipline Code:
0812;
Abstract:
Recent work demonstrated the promise of using resistive random access memory (ReRAM) as an emerging technology to perform inherently parallel analog-domain in-situ matrix-vector multiplication, the intensive and key computation in deep neural networks (DNNs). One key problem is that DNN weights are signed values. However, in a ReRAM crossbar, weights are stored as conductances of the crossbar cells, and the in-situ computation assumes all cells on each crossbar column are of the same sign. Current architectures either use two ReRAM crossbars for positive and negative weights (PRIME), or add an offset to weights so that all values become positive (ISAAC). Neither solution is ideal: they either double the cost of crossbars, or incur extra offset circuitry. To better address this problem, we propose FORMS, a fine-grained ReRAM-based DNN accelerator with algorithm/hardware co-design. Instead of trying to represent positive/negative weights, our key design principle is to enforce exactly what is assumed by the in-situ computation: all weights in the same column of a crossbar have the same sign. This naturally avoids the cost of an additional crossbar. Such polarized weights can be generated using alternating direction method of multipliers (ADMM) regularized optimization during DNN training, which can exactly enforce certain patterns in DNN weights. To achieve high accuracy, we divide the crossbar into logical sub-arrays and only enforce this property within the fine-grained sub-array columns. Crucially, the small sub-arrays provide a unique opportunity for input zero-skipping, which avoids unnecessary computations and significantly reduces computation time. At the same time, it also makes the hardware much easier to implement and less susceptible to non-idealities and noise than coarse-grained architectures. Putting it all together, with the same optimized DNN models, FORMS achieves 1.50x and 1.93x throughput improvement in terms of GOPs/(s x mm^2) and GOPs/W compared to ISAAC, and 1.12x~2.4x speedup in terms of frames per second over optimized ISAAC with almost the same power/area cost. Interestingly, the FORMS optimization framework can even speed up the original ISAAC from 10.7x up to 377.9x, reflecting the importance of software/hardware co-design optimizations.
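To make the polarization constraint concrete, below is a minimal NumPy sketch (not the authors' implementation) of the projection step that an ADMM-regularized training loop could apply: for each column of each logical sub-array, it keeps the sign group with the larger energy and zeroes out the minority-sign weights, which is the Euclidean projection onto the set of sub-array columns whose weights all share one sign. The function name polarize_columns and the sub_array_height parameter are illustrative assumptions, not names from the paper.

import numpy as np

def polarize_columns(weights: np.ndarray, sub_array_height: int) -> np.ndarray:
    """Project each sub-array column onto the nearest all-nonnegative or
    all-nonpositive vector (illustrative sketch, not the authors' code)."""
    w = weights.copy()
    rows, cols = w.shape
    for r0 in range(0, rows, sub_array_height):
        for c in range(cols):
            col = w[r0:r0 + sub_array_height, c]    # one logical sub-array column (view)
            pos_energy = np.sum(col[col > 0] ** 2)  # cost of zeroing the positive weights
            neg_energy = np.sum(col[col < 0] ** 2)  # cost of zeroing the negative weights
            if pos_energy >= neg_energy:
                col[col < 0] = 0.0                  # keep positives, drop negatives
            else:
                col[col > 0] = 0.0                  # keep negatives, drop positives
    return w

# Example: a 4x4 weight block mapped onto logical sub-arrays of height 2.
w = np.array([[ 0.5, -0.1,  0.3, -0.4],
              [ 0.2,  0.6, -0.7,  0.1],
              [-0.3,  0.4,  0.2, -0.5],
              [ 0.1, -0.8, -0.2, -0.6]])
print(polarize_columns(w, sub_array_height=2))

In an actual ADMM formulation, such a projection would be applied to the auxiliary variable at each iteration while gradient updates continue on the unconstrained weights, so accuracy lost to polarization can be recovered during training; the sketch shows only the constraint itself.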