A 1-16b Reconfigurable 80Kb 7T SRAM-Based Digital Near-Memory Computing Macro for Processing Neural Networks

Cited by: 12
Authors
Kim, Hyunjoon [1 ]
Mu, Junjie [1 ]
Yu, Chengshuo [1 ]
Kim, Tony Tae-Hyoung [1 ]
Kim, Bongjin [2 ]
Affiliations
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore
[2] Univ Calif Santa Barbara, Dept Elect & Comp Engn, Santa Barbara, CA 93106 USA
Keywords
Random access memory; Computer architecture; Compute-in-memory (CIM); Registers; Adders; Logic gates; Throughput; SRAM; vector matrix multiplication; multiply-and-accumulate; PIM; CIM; digital near-memory computing; IN-MEMORY; ARCHITECTURE; PRECISION;
DOI
10.1109/TCSI.2022.3232648
CLC Classification
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Subject Classification
0808; 0809;
Abstract
This work introduces a digital SRAM-based near-memory compute macro for DNN inference, improving on-chip weight memory capacity and area efficiency compared to state-of-the-art digital computing-in-memory (CIM) macros. A 20 × 256, 1-16b reconfigurable digital near-memory (NM) computing macro is proposed, supporting reconfigurable 1-16b precision through a bit-serial computing scheme and a weight- and input-gating architecture for sparsity-aware operation. Each reconfigurable column MAC comprises 16 custom-designed 7T SRAM bitcells to store 1-16b weights, a conventional 6T SRAM cell for zero-weight-skip control, a bitwise multiplier, and a full adder with a register for partial-sum accumulation. Twenty parallel partial-sum outputs are post-accumulated to generate a sub-partitioned output feature map, which is then concatenated to produce the final convolution result. In addition, a pipelined array structure improves the throughput of the proposed macro. The proposed near-memory computing macro implements 80Kb of binary weight storage in a 0.473 mm² die area in 65nm CMOS. It achieves an area efficiency of 4329-270.6 GOPS/mm² and an energy efficiency of 315.07-1.23 TOPS/W at 1-16b precision.
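The bit-serial, zero-skipping column-MAC scheme summarized in the abstract can be illustrated with a short behavioral model. This is a sketch of the general technique only, not the macro's actual circuit or timing: it assumes unsigned weights and inputs, and the function name and loop structure are illustrative. Inputs are streamed one bit per cycle (LSB first), a bitwise multiplier forms per-cycle partial products, an accumulator plays the role of the full-adder/register chain, and weights flagged as zero are skipped, mirroring the 6T zero-weight-skip control.

```python
def bit_serial_mac(weights, inputs, in_bits=8):
    """Behavioral model of a bit-serial column MAC with zero-weight skipping.

    weights : list of unsigned integer weights stored in the column
    inputs  : list of unsigned integer activations, streamed bit-serially
    in_bits : input precision in bits (the abstract's scheme spans 1-16b)
    """
    acc = 0
    for b in range(in_bits):              # one input bit position per cycle
        cycle_sum = 0
        for w, x in zip(weights, inputs):
            if w == 0:                    # zero-weight skip (sparsity gating)
                continue
            x_bit = (x >> b) & 1          # serial input bit for this cycle
            cycle_sum += w * x_bit        # bitwise multiply, then add
        acc += cycle_sum << b             # shift-and-accumulate partial sums
    return acc
```

Because the shift-and-accumulate loop runs once per input bit, precision is reconfigurable simply by changing the cycle count, which is the essence of the bit-serial approach.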
Pages: 1580-1590 (11 pages)
Related Papers (6 items)
  • [1] Colonnade: A Reconfigurable SRAM-Based Digital Bit-Serial Compute-In-Memory Macro for Processing Neural Networks
    Kim, Hyunjoon
    Yoo, Taegeun
    Kim, Tony Tae-Hyoung
    Kim, Bongjin
    IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2021, 56 (07) : 2221 - 2233
  • [2] A 1-8b Reconfigurable Digital SRAM Compute-in-Memory Macro for Processing Neural Networks
    You, Heng
    Li, Weijun
    Shang, Delong
    Zhou, Yumei
    Qiao, Shushan
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2024, 71 (04) : 1602 - 1614
  • [3] A Dual 7T SRAM-Based Zero-Skipping Compute-In-Memory Macro With 1-6b Binary Searching ADCs for Processing Quantized Neural Networks
    Yu, Chengshuo
    Jiang, Haoge
    Mu, Junjie
    Chai, Kevin Tshun Chuan
    Kim, Tony Tae-Hyoung
    Kim, Bongjin
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2024, 71 (08) : 3672 - 3682
  • [4] A 1-16b Precision Reconfigurable Digital In-Memory Computing Macro Featuring Column-MAC Architecture and Bit-Serial Computation
    Kim, Hyunjoon
    Chen, Qian
    Yoo, Taegeun
    Kim, Tony Tae-Hyoung
    Kim, Bongjin
    IEEE 45TH EUROPEAN SOLID STATE CIRCUITS CONFERENCE (ESSCIRC 2019), 2019, : 345 - 348
  • [5] SRAM-Based In-Memory Computing Macro Featuring Voltage-Mode Accumulator and Row-by-Row ADC for Processing Neural Networks
    Mu, Junjie
    Kim, Hyunjoon
    Kim, Bongjin
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2022, 69 (06) : 2412 - 2422
  • [6] D6CIM: 60.4-TOPS/W, 1.46-TOPS/mm2, 1005-Kb/mm2 Digital 6T-SRAM-Based Compute-in-Memory Macro Supporting 1-to-8b Fixed-Point Arithmetic in 28-nm CMOS
    Oh, Jonghyun
    Lin, Chuan-Tung
    Seok, Mingoo
    IEEE 49TH EUROPEAN SOLID STATE CIRCUITS CONFERENCE, ESSCIRC 2023, 2023, : 413 - 416