AMAIX: A Generic Analytical Model for Deep Learning Accelerators

Cited by: 2
Authors:
Juenger, Lukas [1 ]
Zurstrassen, Niko [1 ]
Kogel, Tim [2 ]
Keding, Holger [2 ]
Leupers, Rainer [1 ]
Affiliations:
[1] Rhein Westfal TH Aachen, Inst Commun Technol & Embedded Syst ICE, Aachen, Germany
[2] Synopsys GmbH, Aschheim, Germany
Keywords:
Deep Learning Accelerators; Analytical models; Design space exploration; Roofline model;
DOI:
10.1007/978-3-030-60939-9_3
CLC Number: TP3 [Computing technology; computer technology]
Discipline Code: 0812
Abstract:
In recent years, the growing popularity of Convolutional Neural Networks (CNNs) has driven the development of specialized hardware, so-called Deep Learning Accelerators (DLAs). The large market for DLAs and the huge number of papers published on DLA design show that there is currently no one-size-fits-all solution. Depending on the optimization goals, such as power consumption or performance, there may be several optimal solutions for each scenario. A commonly used method for finding these solutions as early as possible in the design cycle is the use of analytical models, which describe a design with simple yet insightful and sufficiently accurate formulas. The main contribution of this work is the generic Analytical Model for AI accelerators (AMAIX) for the estimation of CNN inference performance on DLAs. It is based on the popular Roofline model. To show the validity of our approach, AMAIX was applied to the Nvidia Deep Learning Accelerator (NVDLA) as a case study, using the AlexNet and LeNet CNNs as workloads. The resulting performance predictions were verified against an RTL emulation of the NVDLA using a Synopsys ZeBu Server-based hybrid prototype. AMAIX predicted the inference time of AlexNet and LeNet on the NVDLA with an accuracy of up to 88% and 98%, respectively. Furthermore, this work shows how to use the obtained results for root-cause analysis and as a starting point for design space exploration.
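The abstract's central idea, estimating per-layer inference time with a Roofline model, can be sketched in a few lines: attainable throughput is the minimum of the accelerator's compute peak and its memory bandwidth times the workload's operational intensity. The sketch below is a minimal illustration with hypothetical hardware and layer numbers, not the AMAIX model itself or NVDLA parameters.

```python
def roofline_time(ops, bytes_moved, peak_flops, peak_bw):
    """Roofline-style lower bound on execution time in seconds.

    ops         -- arithmetic operations of the layer (FLOPs)
    bytes_moved -- data transferred to/from memory (bytes)
    peak_flops  -- accelerator compute peak (FLOP/s)
    peak_bw     -- memory bandwidth (byte/s)
    """
    oi = ops / bytes_moved                      # operational intensity (FLOP/byte)
    attainable = min(peak_flops, peak_bw * oi)  # the Roofline ceiling (FLOP/s)
    return ops / attainable

# Hypothetical layer: 1 GFLOP, 50 MB traffic, on a 2 TFLOP/s, 25 GB/s device.
# oi = 20 FLOP/byte, so the layer is memory-bound (20 * 25 GB/s < 2 TFLOP/s).
t = roofline_time(1e9, 50e6, 2e12, 25e9)
print(t)  # 0.002 s
```

Summing such per-layer estimates over a network gives a first-order inference-time prediction of the kind the paper validates against RTL emulation.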
Pages: 36-51
Page count: 16