AMAIX: A Generic Analytical Model for Deep Learning Accelerators

Cited by: 2
Authors
Juenger, Lukas [1 ]
Zurstrassen, Niko [1 ]
Kogel, Tim [2 ]
Keding, Holger [2 ]
Leupers, Rainer [1 ]
Affiliations
[1] Rhein Westfal TH Aachen, Inst Commun Technol & Embedded Syst ICE, Aachen, Germany
[2] Synopsys GmbH, Aschheim, Germany
Keywords
Deep Learning Accelerators; Analytical models; Design space exploration; Roofline model
DOI
10.1007/978-3-030-60939-9_3
CLC Number
TP3 [Computing technology, computer technology]
Discipline Code
0812
Abstract
In recent years, the growing popularity of Convolutional Neural Networks (CNNs) has driven the development of specialized hardware, so-called Deep Learning Accelerators (DLAs). The large market for DLAs and the vast number of papers published on DLA design show that there is currently no one-size-fits-all solution. Depending on the given optimization goals, such as power consumption or performance, there may be several optimal solutions for each scenario. A commonly used method for finding these solutions as early as possible in the design cycle is the use of analytical models, which describe a design with simple yet insightful and sufficiently accurate formulas. The main contribution of this work is the generic Analytical Model for AI accelerators (AMAIX) for estimating CNN inference performance on DLAs. It is based on the popular Roofline model. To show the validity of our approach, AMAIX was applied to the Nvidia Deep Learning Accelerator (NVDLA) as a case study, using the AlexNet and LeNet CNNs as workloads. The resulting performance predictions were verified against an RTL emulation of the NVDLA using a Synopsys ZeBu Server-based hybrid prototype. AMAIX predicted the inference time of AlexNet and LeNet on the NVDLA with an accuracy of up to 88% and 98%, respectively. Furthermore, this work shows how to use the obtained results for root-cause analysis and as a starting point for design space exploration.
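Since the abstract names the Roofline model as the basis of AMAIX, a minimal sketch of a Roofline-style per-layer time estimate may help: attainable performance is the minimum of the compute roof (peak throughput) and the memory roof (bandwidth times operational intensity). The hardware and layer numbers below are hypothetical illustrations, not NVDLA specifications or figures from the paper.

```python
def roofline_time(ops, bytes_moved, peak_ops_per_s, peak_bw_bytes_per_s):
    """Estimate execution time of one layer with the Roofline model.

    ops            -- arithmetic operations the layer performs
    bytes_moved    -- bytes transferred to/from main memory
    peak_ops_per_s -- compute roof of the accelerator
    peak_bw_bytes_per_s -- memory bandwidth roof
    """
    intensity = ops / bytes_moved            # operational intensity [ops/byte]
    attainable = min(peak_ops_per_s,         # compute-bound ceiling
                     peak_bw_bytes_per_s * intensity)  # memory-bound ceiling
    return ops / attainable                  # seconds


# Hypothetical accelerator: 1 TOP/s compute, 25 GB/s DRAM bandwidth.
# A layer with 2 GOP and 100 MB of traffic has intensity 20 ops/byte,
# so it is memory bound (25e9 * 20 = 5e11 < 1e12) and takes 4 ms.
t = roofline_time(2e9, 1e8, 1e12, 25e9)
print(t)  # 0.004
```

Summing such per-layer estimates over a network gives a first-order inference-time prediction of the kind the abstract describes; the paper's model refines this with accelerator-specific parameters.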
Pages: 36-51
Page count: 16