Efficient Management of Scratch-Pad Memories in Deep Learning Accelerators

Cited by: 1

Authors:

Pal, Subhankar [1]

Venkataramani, Swagath [2]

Srinivasan, Viji [2]

Gopalakrishnan, Kailash [2]

Affiliations:

[1] Univ Michigan, Ann Arbor, MI 48109 USA

[2] IBM TJ Watson Res Ctr, Yorktown Hts, NY USA

Source:

2021 IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE (ISPASS 2021) | 2021

Keywords:

PERFORMANCE;

DOI:

10.1109/ISPASS51385.2021.00046

Chinese Library Classification:

TP3 [Computing technology, computer technology];

Subject classification code:

0812;

Abstract:

A prevalent challenge for Deep Learning (DL) accelerators is how they are programmed to sustain utilization without impacting end-user productivity. Little prior effort has been devoted to the effective management of their on-chip Scratch-Pad Memory (SPM) across the DL operations of a Deep Neural Network (DNN). This is especially critical due to trends in complex network topologies and the emergence of eager execution. This work demonstrates that there exists up to a 5.2x performance gap in DL inference to be bridged using SPM management, on a set of image, object and language networks. We propose OnSRAM, a novel SPM management framework integrated with a DL accelerator runtime. OnSRAM has two variants, viz. OnSRAM-Static, which works on static graphs to identify data structures that should be held on-chip based on their properties, and OnSRAM-Eager, which targets an eager execution model (no graph) and uses a speculative scheme to hold/discard data structures. On a prototypical DL accelerator, OnSRAM-Static and OnSRAM-Eager achieve reductions in inference latency (batch size of 1) of 1.02-4.8x and 1.02-3.1x, respectively, over a baseline with no SPM management.
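The abstract says OnSRAM-Static picks which data structures to hold on-chip "based on their properties" but does not name those properties. As an illustration only, the sketch below assumes two of them (tensor size and lifetime in the static graph's execution order) and uses a simple greedy rule. The names `Tensor` and `plan_pinning` and the cost heuristic are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Tensor:
    name: str
    size_kb: int
    produced_at: int  # index of the producing op in execution order
    last_use: int     # index of the last consuming op


def plan_pinning(tensors, spm_capacity_kb):
    """Greedy sketch: pin a tensor in SPM if it fits for its whole lifetime.

    Assumption (not from the paper): tensors with a small size * lifetime
    product are considered first, since they occupy the SPM the least.
    """
    if not tensors:
        return []
    n_steps = max(t.last_use for t in tensors) + 1
    used_kb = [0] * n_steps  # SPM occupancy at each execution step
    pinned = []
    by_cost = sorted(
        tensors, key=lambda t: t.size_kb * (t.last_use - t.produced_at)
    )
    for t in by_cost:
        lifetime = range(t.produced_at, t.last_use + 1)
        if all(used_kb[s] + t.size_kb <= spm_capacity_kb for s in lifetime):
            for s in lifetime:
                used_kb[s] += t.size_kb
            pinned.append(t.name)
    return pinned


graph_tensors = [
    Tensor("a", 64, 0, 1),
    Tensor("b", 128, 1, 2),
    Tensor("c", 200, 0, 3),
]
pinned_names = plan_pinning(graph_tensors, spm_capacity_kb=256)
```

In this toy graph the long-lived 200 KB tensor does not fit alongside the others and would be left off-chip. An eager-execution variant has no graph to inspect up front, so it would have to guess lifetimes speculatively, as the abstract describes for OnSRAM-Eager.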

Citation:

Pages: 240-242

Page count: 3
