PANTHER: A Programmable Architecture for Neural Network Training Harnessing Energy-Efficient ReRAM

Cited by: 47
Authors
Ankit, Aayush [1 ]
El Hajj, Izzat [2 ]
Chalamalasetti, Sai Rahul [3 ]
Agarwal, Sapan [4 ]
Marinella, Matthew [4 ]
Foltin, Martin [3 ]
Strachan, John Paul [3 ]
Milojicic, Dejan [3 ]
Hwu, Wen-Mei [5 ]
Roy, Kaushik [1 ]
Affiliations
[1] Purdue Univ, Dept Elect & Comp Engn, W Lafayette, IN 47907 USA
[2] Amer Univ Beirut, Dept Comp Sci, Beirut 11072020, Lebanon
[3] Hewlett Packard Labs, San Jose, CA 95002 USA
[4] Sandia Natl Labs, Livermore, CA 94550 USA
[5] Univ Illinois, Dept Elect & Comp Engn, Champaign, IL 61820 USA
Keywords
Accelerators; resistive random-access memory (ReRAM); neural networks; training
DOI
10.1109/TC.2020.2998456
CLC Classification Number
TP3 [Computing Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
The wide adoption of deep neural networks has been accompanied by ever-increasing energy and performance demands due to the expensive nature of training them. Numerous special-purpose architectures have been proposed to accelerate training, both digital and hybrid digital-analog designs using resistive RAM (ReRAM) crossbars. ReRAM-based accelerators have demonstrated the effectiveness of ReRAM crossbars at performing the matrix-vector multiplication operations that are prevalent in training. However, they still suffer from inefficiency due to the use of serial reads and writes for performing the weight gradient and update step. A few works have demonstrated the possibility of performing outer products in crossbars, which can be used to realize the weight gradient and update step without serial reads and writes. However, these works have been limited to low-precision operations, which are insufficient for typical training workloads. Moreover, they have been confined to a limited set of training algorithms for fully-connected layers only. To address these limitations, we propose a bit-slicing technique for enhancing the precision of ReRAM-based outer products, which is substantially different from bit-slicing for matrix-vector multiplication only. We incorporate this technique into a crossbar architecture with three variants catered to different training algorithms. To evaluate our design on different types of layers in neural networks (fully-connected, convolutional, etc.) and on different training algorithms, we develop PANTHER, an ISA-programmable training accelerator with compiler support. Our design can also be integrated into other accelerators in the literature to enhance their efficiency. Our evaluation shows that PANTHER achieves up to 8.02x, 54.21x, and 103x energy reductions as well as 7.16x, 4.02x, and 16x execution time reductions compared to digital accelerators, ReRAM-based accelerators, and GPUs, respectively.
Pages: 1128-1142
Page count: 15
Related Papers
50 entries in total (showing [41]-[50])
  • [41] Energy-efficient cyber-physical production network: Architecture and technologies
    Lu, Yuqian
    Peng, Tao
    Xu, Xun
    [J]. COMPUTERS & INDUSTRIAL ENGINEERING, 2019, 129 : 56 - 66
  • [42] Energy-Efficient Optical Network-on-Chip Architecture for Heterogeneous Multicores
    Van Winkle, Scott
    DiTomaso, Dominic
    Kennedy, Matthew
    Kodi, Avinash
    [J]. 2016 IEEE OPTICAL INTERCONNECTS CONFERENCE (OI), 2016, : 62 - 63
  • [43] GRaNADA: A Network-Aware and Energy-Efficient PaaS Cloud Architecture
    Cuadrado-Cordero, Ismael
    Orgerie, Anne-Cecile
    Morin, Christine
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND DATA INTENSIVE SYSTEMS, 2015, : 412 - 419
  • [44] HREN: A Hybrid Reliable and Energy-Efficient Network-on-Chip Architecture
    Bhamidipati, Padmaja
    Karanth, Avinash
    [J]. IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2022, 10 (02) : 537 - 548
  • [45] Energy-Efficient Multiple Network-on-Chip Architecture With Bandwidth Expansion
    Zhou, Wu
    Ouyang, Yiming
    Xu, Dongyu
    Huang, Zhengfeng
    Liang, Huaguo
    Wen, Xiaoqing
    [J]. IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2023, 31 (04) : 442 - 455
  • [46] Energy-Efficient Virtual Network Embedding Algorithm Based on Hopfield Neural Network
    He, Mengyang
    Zhuang, Lei
    Yang, Sijin
    Zhang, Jianhui
    Meng, Huiping
    [J]. WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2021, 2021
  • [47] TRAINING DEEP SPIKING NEURAL NETWORKS FOR ENERGY-EFFICIENT NEUROMORPHIC COMPUTING
    Srinivasan, Gopalakrishnan
    Lee, Chankyu
    Sengupta, Abhronil
    Panda, Priyadarshini
    Sarwar, Syed Shakib
    Roy, Kaushik
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 8549 - 8553
  • [48] Rank order coding based spiking convolutional neural network architecture with energy-efficient membrane voltage updates
    Tang, Hoyoung
    Cho, Donghyeon
    Lew, Dongwoo
    Kim, Taehwan
    Park, Jongsun
    [J]. NEUROCOMPUTING, 2020, 407 : 300 - 312
  • [49] EnTiered-ReRAM: An Enhanced Low Latency and Energy Efficient TLC Crossbar ReRAM Architecture
    Zhang, Yang
    Yu, Zhibin
    Gu, Liang
    Wang, Chengning
    Feng, Dan
    [J]. IEEE ACCESS, 2021, 9 : 167173 - 167189
  • [50] Programmable Energy-Efficient Analog Multilayer Perceptron Architecture Suitable for Future Expansion to Hardware Accelerators
    Dix, Jeff
    Holleman, Jeremy
    Blalock, Benjamin J.
    [J]. JOURNAL OF LOW POWER ELECTRONICS AND APPLICATIONS, 2023, 13 (03)