Automatic Generation of Fast and Accurate Performance Models for Deep Neural Network Accelerators

Cited by: 0

Authors:

Luebeck, Konstantin [1]

Jung, Alexander Louis-Ferdinand [1]

Wedlich, Felix [1]

Mueller, Mika Markus [1]

Peccia, Federico Nicolas [2]

Thoemmes, Felix [2]

Steinmetz, Jannik [1]

Biermaier, Valentin [1]

Frischknecht, Adrian [1]

Bernardo, Paul Palomero [1]

Bringmann, Oliver [1]

Affiliations:

[1] Univ Tubingen, Embedded Syst, Tubingen, Baden Wurttembe, Germany

[2] FZI, Karlsruhe, Baden Wurttembe, Germany

Source:

ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS | 2025 / Vol. 24 / No. 02

Keywords:

Deep neural networks; performance estimation; analytical model

DOI:

10.1145/3715122

Chinese Library Classification (CLC):

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Implementing Deep Neural Networks (DNNs) on resource-constrained edge devices is a challenging task that requires tailored hardware accelerator architectures and a clear understanding of their performance characteristics when executing the intended AI workload. To facilitate this, we present an approach for automatically generating fast performance models that accurately estimate the latency of a DNN mapped onto systematically modeled and concisely described accelerator architectures. Using our accelerator architecture description method, we modeled representative DNN accelerators such as Gemmini, UltraTrail, a Plasticine-derived architecture, and a parameterizable systolic array. Together with DNN mappings for those modeled architectures, we perform a combined DNN/hardware dependency graph analysis, which enables us, in the best case, to evaluate only 154 loop kernel iterations to estimate the performance for 4.19 billion instructions, achieving a significant speedup. We outperform regression and analytical models in terms of mean absolute percentage error (MAPE) compared with simulation results, while being several orders of magnitude faster than an RTL simulation.
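
The accuracy metric named in the abstract, MAPE, is the average of the absolute relative deviations between estimated and reference values, expressed as a percentage. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `mape` and the latency values are hypothetical and chosen only for illustration.

```python
def mape(reference, estimated):
    """Mean absolute percentage error of estimates against reference values.

    Standard definition: 100 * mean(|(ref - est) / ref|).
    Reference values must be non-zero.
    """
    if len(reference) != len(estimated) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((r - e) / r) for r, e in zip(reference, estimated))
    return 100.0 * total / len(reference)


# Hypothetical latencies in cycles (illustrative only, not from the paper):
# the reference would come from a simulation, the estimate from a performance model.
simulated = [1000.0, 2000.0, 4000.0]
estimated = [1100.0, 1900.0, 4000.0]

print(f"MAPE = {mape(simulated, estimated):.2f}%")
```

A lower MAPE means the model's latency estimates lie closer to the simulated reference on average; because each error is relative, short and long workloads contribute equally.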


Pages: 32
