FORMS: Fine-grained Polarized ReRAM-based In-situ Computation for Mixed-signal DNN Accelerator

Cited by: 35
Authors
Yuan, Geng [1 ]
Behnam, Payman [2 ]
Li, Zhengang [1 ]
Shafiee, Ali [3 ]
Lin, Sheng [1 ]
Ma, Xiaolong [1 ]
Liu, Hang [4 ]
Qian, Xuehai [5 ]
Bojnordi, Mahdi Nazm [6 ]
Wang, Yanzhi [1 ]
Ding, Caiwen [7 ]
Affiliations
[1] Northeastern Univ, Boston, MA 02115 USA
[2] Georgia Inst Technol, Atlanta, GA 30332 USA
[3] Samsung, Seoul, South Korea
[4] Stevens Inst Technol, Hoboken, NJ 07030 USA
[5] Univ Southern Calif, Los Angeles, CA 90089 USA
[6] Univ Utah, Salt Lake City, UT 84112 USA
[7] Univ Connecticut, Storrs, CT USA
Source
2021 ACM/IEEE 48TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA 2021) | 2021
Funding
U.S. National Science Foundation
Keywords
DOI
10.1109/ISCA52012.2021.00029
Chinese Library Classification
TP3 [Computing Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Recent work demonstrated the promise of using resistive random access memory (ReRAM) as an emerging technology to perform inherently parallel analog-domain in-situ matrix-vector multiplication, the intensive and key computation in deep neural networks (DNNs). One key problem is that DNN weights are signed values. However, in a ReRAM crossbar, weights are stored as the conductance of the crossbar cells, and the in-situ computation assumes all cells on each crossbar column have the same sign. Current architectures either use two ReRAM crossbars for positive and negative weights (PRIME), or add an offset to the weights so that all values become positive (ISAAC). Neither solution is ideal: they either double the cost of crossbars or incur extra offset circuitry. To better address this problem, we propose FORMS, a fine-grained ReRAM-based DNN accelerator with algorithm/hardware co-design. Instead of trying to represent the positive/negative weights, our key design principle is to enforce exactly what is assumed by the in-situ computation: ensuring that all weights in the same column of a crossbar have the same sign. This naturally avoids the cost of an additional crossbar. Such polarized weights can be generated using alternating direction method of multipliers (ADMM) regularized optimization during DNN training, which can exactly enforce certain patterns in the DNN weights. To achieve high accuracy, we divide the crossbar into logical sub-arrays and only enforce this property within the fine-grained sub-array columns. Crucially, the small sub-arrays provide a unique opportunity for input zero-skipping, which avoids a significant amount of unnecessary computation and reduces computation time. At the same time, it also makes the hardware much easier to implement and less susceptible to non-idealities and noise than coarse-grained architectures. Putting it all together, with the same optimized DNN models, FORMS achieves 1.50x and 1.93x throughput improvement in terms of GOPs/(s x mm^2) and GOPs/W compared to ISAAC, and a 1.12x~2.4x speedup in terms of frames per second over optimized ISAAC with almost the same power/area cost. Interestingly, the FORMS optimization framework can even speed up the original ISAAC by 10.7x to 377.9x, reflecting the importance of software/hardware co-design optimizations.
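To make the polarization idea concrete, the following is a minimal Python/NumPy sketch of the kind of sign-projection step an ADMM-style training loop could apply to each logical sub-array column. The function name polarize_subarray_columns, the sub_rows granularity, and the dominant-sign-by-magnitude rule are illustrative assumptions, not the exact FORMS procedure described in the paper.

import numpy as np

def polarize_subarray_columns(W, sub_rows=16):
    """Project W onto the polarized pattern assumed by the in-situ computation:
    all weights inside one logical sub-array column (sub_rows consecutive rows
    of a crossbar column) share the same sign.

    Hypothetical illustration: the dominant sign is picked by total magnitude
    and minority-sign weights are zeroed; the actual ADMM projection in FORMS
    may use a different rule.
    """
    W = W.copy()
    rows, cols = W.shape
    for c in range(cols):
        for r0 in range(0, rows, sub_rows):
            block = W[r0:r0 + sub_rows, c]          # view into one logical sub-array column
            pos_mag = block[block > 0].sum()
            neg_mag = -block[block < 0].sum()
            # Zero out the minority-sign weights so the slice becomes single-signed.
            if pos_mag >= neg_mag:
                block[block < 0] = 0.0
            else:
                block[block > 0] = 0.0
    return W

# Usage: polarize a 128x64 weight matrix mapped onto 16-row logical sub-arrays.
rng = np.random.default_rng(0)
W_polarized = polarize_subarray_columns(rng.standard_normal((128, 64)), sub_rows=16)

In an ADMM setting, a projection of this kind would alternate with gradient updates so that the trained weights converge to the polarized pattern while accuracy is preserved.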
Pages: 265 - 278
Number of pages: 14
Related Papers
5 records
  • [1] ReRAM-Sharing: Fine-Grained Weight Sharing for ReRAM-Based Deep Neural Network Accelerator
    Song, Zhuoran
    Li, Dongyue
    He, Zhezhi
    Liang, Xiaoyao
    Jiang, Li
2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2021
  • [2] ERA-BS: Boosting the Efficiency of ReRAM-Based PIM Accelerator With Fine-Grained Bit-Level Sparsity
    Liu, Fangxin
    Zhao, Wenbo
    Wang, Zongwu
    Chen, Yongbiao
    Liang, Xiaoyao
    Jiang, Li
    IEEE TRANSACTIONS ON COMPUTERS, 2024, 73 (09) : 2320 - 2334
  • [3] ADC-Free ReRAM-Based In-Situ Accelerator for Energy-Efficient Binary Neural Networks
    Kim, Hyeonuk
    Jung, Youngbeom
    Kim, Lee-Sup
    IEEE TRANSACTIONS ON COMPUTERS, 2024, 73 (02) : 353 - 365
  • [4] BISWSRBS: A Winograd-based CNN Accelerator with a Fine-grained Regular Sparsity Pattern and Mixed Precision Quantization
    Yang, Tao
    He, Zhezhi
    Kou, Tengchuan
    Li, Qingzheng
    Han, Qi
    Yu, Haibao
    Liu, Fangxin
    Liang, Yun
    Jiang, Li
    ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS, 2021, 14 (04)
  • [5] A fine-grained mixed precision DNN accelerator using a two-stage big-little core RISC-V MCU
    Zhang, Li
    Lv, Qishen
    Gao, Di
    Zhou, Xian
    Meng, Wenchao
    Yang, Qinmin
    Zhuo, Cheng
    INTEGRATION-THE VLSI JOURNAL, 2023, 88 : 241 - 248