PIMCA: A Programmable In-Memory Computing Accelerator for Energy-Efficient DNN Inference

Cited by: 17
Authors
Zhang, Bo [1 ]
Yin, Shihui [2 ,3 ]
Kim, Minkyu [2 ]
Saikia, Jyotishman [2 ]
Kwon, Soonwan [4 ]
Myung, Sungmeen [4 ]
Kim, Hyunsoo [4 ]
Kim, Sang Joon [4 ]
Seo, Jae-Sun [2 ]
Seok, Mingoo [1 ]
Affiliations
[1] Columbia Univ, Dept Elect Engn, New York, NY 10027 USA
[2] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85287 USA
[3] Huawei, Shenzhen 518129, Peoples R China
[4] Samsung Adv Inst Technol, Suwon 16678, South Korea
Keywords
Capacitive-coupling computing; deep neural network (DNN); in-memory computing (IMC); programmable accelerator; SRAM macro; computation
DOI
10.1109/JSSC.2022.3211290
CLC classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline codes
0808; 0809
Abstract
This article presents a programmable in-memory computing accelerator (PIMCA) for low-precision (1-2 b) deep neural network (DNN) inference. The custom 10T1C bitcell in the in-memory computing (IMC) macro has four additional transistors and one capacitor to perform capacitive-coupling-based multiply-and-accumulate (MAC) operations in the analog mixed-signal (AMS) domain. A macro containing 256 × 128 bitcells can activate all of its rows simultaneously and, as a result, perform a vector-matrix multiplication (VMM) in one cycle. PIMCA integrates 108 such IMC static random-access memory (SRAM) macros with a custom six-stage pipeline and a custom instruction set architecture (ISA) for instruction-level programmability. The results of the IMC macros are fed to a single-instruction-multiple-data (SIMD) processor for other computations such as partial-sum accumulation, max-pooling, and activation functions. To use the IMC and SIMD datapaths effectively, we customize the ISA, in particular by adding hardware loop support, which reduces the program size by up to 73%. The accelerator is prototyped in a 28-nm technology and integrates a total of 3.4-Mb IMC SRAM and 1.5-Mb off-the-shelf activation SRAM, making it one of the largest IMC accelerators to date. It achieves a system-level energy efficiency of 437 TOPS/W and a peak throughput of 49 TOPS at a 42-MHz clock frequency and 1-V supply for VGG9 and ResNet-18 on the CIFAR-10 dataset.
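The single-cycle VMM described above can be illustrated with a simple behavioral model. This is only a functional sketch, not the paper's AMS circuit: it assumes bipolar (+1/-1) 1-b weights and activations and models each column's capacitive-coupling MAC as an ideal dot product, which is equivalent to the XNOR-popcount formulation commonly used for binary DNN hardware.

```python
import numpy as np

# Macro dimensions from the abstract: 256 rows x 128 columns of bitcells.
ROWS, COLS = 256, 128

rng = np.random.default_rng(0)
weights = rng.choice([-1, 1], size=(ROWS, COLS))  # 1-b weights in bitcells
activations = rng.choice([-1, 1], size=ROWS)      # 1-b inputs, one per row

# All rows activate simultaneously, so the full VMM completes in one step;
# each of the 128 column outputs is one analog MAC result.
mac_out = activations @ weights                   # shape (128,)

# Equivalent digital XNOR-popcount form: with bipolar operands, the product
# is +1 exactly when the sign bits agree, so dot = 2*popcount - ROWS.
w_bits = (weights > 0).astype(int)
a_bits = (activations > 0).astype(int)
xnor = 1 - (w_bits ^ a_bits[:, None])
popcount = xnor.sum(axis=0)
assert np.array_equal(2 * popcount - ROWS, mac_out)
```

In the actual silicon, the column sum is produced by charge sharing across the bitcell capacitors and then digitized, rather than computed digitally as here.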
Pages: 1436-1449 (14 pages)