PIMCA: A Programmable In-Memory Computing Accelerator for Energy-Efficient DNN Inference

Cited by: 17
Authors
Zhang, Bo [1 ]
Yin, Shihui [2 ,3 ]
Kim, Minkyu [2 ]
Saikia, Jyotishman [2 ]
Kwon, Soonwan [4 ]
Myung, Sungmeen [4 ]
Kim, Hyunsoo [4 ]
Kim, Sang Joon [4 ]
Seo, Jae-Sun [2 ]
Seok, Mingoo [1 ]
Affiliations
[1] Columbia Univ, Dept Elect Engn, New York, NY 10027 USA
[2] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85287 USA
[3] Huawei, Shenzhen 518129, Peoples R China
[4] Samsung Adv Inst Technol, Suwon 16678, South Korea
Keywords
Capacitive-coupling computing; deep neural network (DNN); in-memory computing (IMC); programmable accelerator; SRAM macro; computation
DOI
10.1109/JSSC.2022.3211290
CLC classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline codes
0808; 0809
Abstract
This article presents a programmable in-memory computing accelerator (PIMCA) for low-precision (1-2 b) deep neural network (DNN) inference. The custom 10T1C bitcell in the in-memory computing (IMC) macro has four additional transistors and one capacitor to perform capacitive-coupling-based multiply-and-accumulate (MAC) operations in the analog mixed-signal (AMS) domain. A macro containing 256 × 128 bitcells can activate all of its rows simultaneously and, as a result, perform a vector-matrix multiplication (VMM) in one cycle. PIMCA integrates 108 such IMC static random-access memory (SRAM) macros with a custom six-stage pipeline and a custom instruction set architecture (ISA) for instruction-level programmability. The results of the IMC macros are fed to a single-instruction-multiple-data (SIMD) processor for other computations such as partial-sum accumulation, max-pooling, and activation functions. To use the IMC and SIMD datapaths effectively, we customize the ISA, in particular by adding hardware loop support, which reduces the program size by up to 73%. The accelerator is prototyped in a 28-nm technology and integrates a total of 3.4-Mb IMC SRAM and 1.5-Mb off-the-shelf activation SRAM, making it one of the largest IMC accelerators to date. It achieves a system-level energy efficiency of 437 TOPS/W and a peak throughput of 49 TOPS at a 42-MHz clock frequency and 1-V supply for VGG9 and ResNet-18 on the CIFAR-10 dataset.
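The single-cycle VMM described above can be illustrated with a simple behavioral model. This is only a functional sketch, not the paper's AMS circuit: it assumes bipolar (+1/-1) 1-b weights and activations and models each column's capacitive-coupling MAC as an ideal dot product, which is equivalent to the XNOR-popcount formulation commonly used for binary DNN hardware.

```python
import numpy as np

# Macro dimensions from the abstract: 256 rows x 128 columns of bitcells.
ROWS, COLS = 256, 128

rng = np.random.default_rng(0)
weights = rng.choice([-1, 1], size=(ROWS, COLS))  # 1-b weights in bitcells
activations = rng.choice([-1, 1], size=ROWS)      # 1-b inputs, one per row

# All rows activate simultaneously, so the full VMM completes in one step;
# each of the 128 column outputs is one analog MAC result.
mac_out = activations @ weights                   # shape (128,)

# Equivalent digital XNOR-popcount form: with bipolar operands, the product
# is +1 exactly when the sign bits agree, so dot = 2*popcount - ROWS.
w_bits = (weights > 0).astype(int)
a_bits = (activations > 0).astype(int)
xnor = 1 - (w_bits ^ a_bits[:, None])
popcount = xnor.sum(axis=0)
assert np.array_equal(2 * popcount - ROWS, mac_out)
```

In the actual silicon, the column sum is produced by charge sharing across the bitcell capacitors and then digitized, rather than computed digitally as here.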
Pages: 1436-1449 (14 pages)