Vesti: Energy-Efficient In-Memory Computing Accelerator for Deep Neural Networks

Cited by: 38
Authors
Yin, Shihui [1]
Jiang, Zhewei [2]
Kim, Minkyu [1]
Gupta, Tushar [3]
Seok, Mingoo [2]
Seo, Jae-Sun [1]
Affiliations
[1] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85287 USA
[2] Columbia Univ, Dept Elect Engn, New York, NY 10027 USA
[3] Synopsys, Mountain View, CA 94043 USA
Funding
U.S. National Science Foundation (NSF)
Keywords
Deep learning accelerator; deep neural networks (DNNs); double-buffering; in-memory computing; SRAM;
DOI
10.1109/TVLSI.2019.2940649
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
To enable essential deep learning computation on energy-constrained hardware platforms, including mobile, wearable, and Internet of Things (IoT) devices, a number of digital ASIC designs have presented customized dataflows and enhanced parallelism. However, in conventional digital designs, the biggest bottleneck for energy-efficient deep neural networks (DNNs) has reportedly been data access and movement. To eliminate this storage-access bottleneck, SRAM macros that support in-memory computing have recently been demonstrated. Several in-SRAM computing works have used a mix of analog and digital circuits to perform the XNOR-and-ACcumulate (XAC) operation without row-by-row memory access, and can map a subset of DNNs with binary weights and binary activations. At the single-array level, large improvements in energy efficiency (e.g., two orders of magnitude) have been reported for XAC relative to digital-only hardware performing the same operation. In this article, by integrating many instances of such in-memory computing SRAM macros with an ensemble of peripheral digital circuits, we architect a new DNN accelerator named Vesti. The accelerator is designed to support configurable multibit activations and large-scale DNNs seamlessly while substantially improving chip-level energy efficiency, with a favorable accuracy tradeoff compared to conventional digital ASICs. Vesti also employs double-buffering with two groups of in-memory computing SRAMs, effectively hiding their row-by-row write latencies. The Vesti accelerator is fully designed and laid out in 65-nm CMOS, demonstrating ultralow energy consumption of <20 nJ for MNIST classification and <40 μJ for CIFAR-10 classification at a 1.0-V supply.
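For intuition on the two techniques named in the abstract, the following minimal Python sketch (ours, not the paper's: the hardware evaluates XAC with mixed analog/digital circuits inside the SRAM array, and the ping-pong scheduler below is only a toy abstraction of the double-buffering idea) assumes the standard binarized-DNN encoding in which weights and activations in {-1, +1} are packed LSB-first as bits {0, 1} of an integer:

# XNOR-and-ACcumulate (XAC): dot product of two n-element {-1, +1}
# vectors, each packed LSB-first into an n-bit integer.
def xac(w_bits: int, a_bits: int, n: int) -> int:
    agree = ~(w_bits ^ a_bits) & ((1 << n) - 1)  # XNOR: 1 where signs match
    # Each agreeing position contributes +1, each disagreement -1, so
    # sum = (#agree) - (#disagree) = 2 * popcount(agree) - n
    return 2 * bin(agree).count("1") - n

# Example: w = (+1, +1, -1, +1), a = (+1, +1, +1, -1)  ->  1 + 1 - 1 - 1 = 0
assert xac(0b1011, 0b0111, 4) == 0

# Double-buffering (ping-pong) as a toy schedule: while one group of
# in-memory computing SRAMs evaluates layer i, the other group is
# written row by row with layer i+1's weights, hiding the write latency.
def pingpong_schedule(num_layers: int):
    for i in range(num_layers):
        compute, preload = ("group A", "group B") if i % 2 == 0 else ("group B", "group A")
        yield i, compute, preload  # layer i computes on `compute` while `preload` is rewritten

Multibit activations, which Vesti supports at configurable precision, can in principle be realized by repeating such binary XAC passes bit-serially and weighting the partial sums by powers of two, though the abstract does not spell out the exact scheme.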
Pages: 48-61
Page count: 14