PANTHER: A Programmable Architecture for Neural Network Training Harnessing Energy-Efficient ReRAM

Cited by: 47
Authors
Ankit, Aayush [1 ]
El Hajj, Izzat [2 ]
Chalamalasetti, Sai Rahul [3 ]
Agarwal, Sapan [4 ]
Marinella, Matthew [4 ]
Foltin, Martin [3 ]
Strachan, John Paul [3 ]
Milojicic, Dejan [3 ]
Hwu, Wen-Mei [5 ]
Roy, Kaushik [1 ]
Affiliations
[1] Purdue Univ, Dept Elect & Comp Engn, W Lafayette, IN 47907 USA
[2] Amer Univ Beirut, Dept Comp Sci, Beirut 11072020, Lebanon
[3] Hewlett Packard Labs, San Jose, CA 95002 USA
[4] Sandia Natl Labs, Livermore, CA 94550 USA
[5] Univ Illinois, Dept Elect & Comp Engn, Champaign, IL 61820 USA
Keywords
Accelerators; resistive random-access memory (ReRAM); neural networks; training
DOI
10.1109/TC.2020.2998456
CLC Classification Number
TP3 [Computing Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
The wide adoption of deep neural networks has been accompanied by ever-increasing energy and performance demands due to the expensive nature of training them. Numerous special-purpose architectures have been proposed to accelerate training, both digital and hybrid digital-analog designs using resistive RAM (ReRAM) crossbars. ReRAM-based accelerators have demonstrated the effectiveness of ReRAM crossbars at performing the matrix-vector multiplication operations that are prevalent in training. However, they still suffer from inefficiency due to the use of serial reads and writes for performing the weight gradient and update step. A few works have demonstrated the possibility of performing outer products in crossbars, which can be used to realize the weight gradient and update step without serial reads and writes. However, these works have been limited to low-precision operations, which are insufficient for typical training workloads. Moreover, they have been confined to a limited set of training algorithms for fully-connected layers only. To address these limitations, we propose a bit-slicing technique for enhancing the precision of ReRAM-based outer products, which is substantially different from bit-slicing for matrix-vector multiplication only. We incorporate this technique into a crossbar architecture with three variants catered to different training algorithms. To evaluate our design on different types of layers in neural networks (fully-connected, convolutional, etc.) and on different training algorithms, we develop PANTHER, an ISA-programmable training accelerator with compiler support. Our design can also be integrated into other accelerators in the literature to enhance their efficiency. Our evaluation shows that PANTHER achieves up to 8.02x, 54.21x, and 103x energy reductions as well as 7.16x, 4.02x, and 16x execution time reductions compared to digital accelerators, ReRAM-based accelerators, and GPUs, respectively.
Pages: 1128-1142
Page count: 15
Related Papers
50 entries in total (showing [41]-[50])
  • [41] Energy-efficient cyber-physical production network: Architecture and technologies
    Lu, Yuqian
    Peng, Tao
    Xu, Xun
    [J]. COMPUTERS & INDUSTRIAL ENGINEERING, 2019, 129 : 56 - 66
  • [42] Energy-Efficient Optical Network-on-Chip Architecture for Heterogeneous Multicores
    Van Winkle, Scott
    DiTomaso, Dominic
    Kennedy, Matthew
    Kodi, Avinash
    [J]. 2016 IEEE OPTICAL INTERCONNECTS CONFERENCE (OI), 2016, : 62 - 63
  • [43] GRaNADA: A Network-Aware and Energy-Efficient PaaS Cloud Architecture
    Cuadrado-Cordero, Ismael
    Orgerie, Anne-Cecile
    Morin, Christine
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND DATA INTENSIVE SYSTEMS, 2015, : 412 - 419
  • [44] HREN: A Hybrid Reliable and Energy-Efficient Network-on-Chip Architecture
    Bhamidipati, Padmaja
    Karanth, Avinash
    [J]. IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2022, 10 (02) : 537 - 548
  • [45] Energy-Efficient Multiple Network-on-Chip Architecture With Bandwidth Expansion
    Zhou, Wu
    Ouyang, Yiming
    Xu, Dongyu
    Huang, Zhengfeng
    Liang, Huaguo
    Wen, Xiaoqing
    [J]. IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2023, 31 (04) : 442 - 455
  • [46] Energy-Efficient Virtual Network Embedding Algorithm Based on Hopfield Neural Network
    He, Mengyang
    Zhuang, Lei
    Yang, Sijin
    Zhang, Jianhui
    Meng, Huiping
    [J]. WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2021, 2021
  • [47] TRAINING DEEP SPIKING NEURAL NETWORKS FOR ENERGY-EFFICIENT NEUROMORPHIC COMPUTING
    Srinivasan, Gopalakrishnan
    Lee, Chankyu
    Sengupta, Abhronil
    Panda, Priyadarshini
    Sarwar, Syed Shakib
    Roy, Kaushik
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 8549 - 8553
  • [48] Rank order coding based spiking convolutional neural network architecture with energy-efficient membrane voltage updates
    Tang, Hoyoung
    Cho, Donghyeon
    Lew, Dongwoo
    Kim, Taehwan
    Park, Jongsun
    [J]. NEUROCOMPUTING, 2020, 407 : 300 - 312
  • [49] EnTiered-ReRAM: An Enhanced Low Latency and Energy Efficient TLC Crossbar ReRAM Architecture
    Zhang, Yang
    Yu, Zhibin
    Gu, Liang
    Wang, Chengning
    Feng, Dan
    [J]. IEEE ACCESS, 2021, 9 : 167173 - 167189
  • [50] Programmable Energy-Efficient Analog Multilayer Perceptron Architecture Suitable for Future Expansion to Hardware Accelerators
    Dix, Jeff
    Holleman, Jeremy
    Blalock, Benjamin J.
    [J]. JOURNAL OF LOW POWER ELECTRONICS AND APPLICATIONS, 2023, 13 (03)