AMAIX: A Generic Analytical Model for Deep Learning Accelerators

Cited by: 2
Authors:
Juenger, Lukas [1 ]
Zurstrassen, Niko [1 ]
Kogel, Tim [2 ]
Keding, Holger [2 ]
Leupers, Rainer [1 ]
Affiliations:
[1] RWTH Aachen University, Institute for Communication Technologies and Embedded Systems (ICE), Aachen, Germany
[2] Synopsys GmbH, Aschheim, Germany
Keywords:
Deep Learning Accelerators; Analytical models; Design space exploration; Roofline model
DOI: 10.1007/978-3-030-60939-9_3
CLC classification:
TP3 [Computing technology, computer technology]
Subject classification code:
0812
Abstract:
In recent years, the growing popularity of Convolutional Neural Networks (CNNs) has driven the development of specialized hardware, so-called Deep Learning Accelerators (DLAs). The large market for DLAs and the large number of papers published on DLA design show that there is currently no one-size-fits-all solution. Depending on the given optimization goals, such as power consumption or performance, there may be several optimal solutions for each scenario. A commonly used method for finding these solutions as early as possible in the design cycle is the use of analytical models, which describe a design with simple yet insightful and sufficiently accurate formulas. The main contribution of this work is the generic Analytical Model for AI accelerators (AMAIX) for estimating CNN inference performance on DLAs. It is based on the popular Roofline model. To show the validity of our approach, AMAIX was applied to the Nvidia Deep Learning Accelerator (NVDLA) as a case study, using the AlexNet and LeNet CNNs as workloads. The resulting performance predictions were verified against an RTL emulation of the NVDLA using a Synopsys ZeBu Server-based hybrid prototype. AMAIX predicted the inference time of AlexNet and LeNet on the NVDLA with an accuracy of up to 88% and 98%, respectively. Furthermore, this work shows how to use the obtained results for root-cause analysis and as a starting point for design space exploration.
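As the abstract notes, AMAIX builds on the Roofline model, in which each layer's execution time is bounded by whichever of peak compute throughput or peak memory bandwidth is the tighter limit. A minimal Python sketch of such a per-layer estimate follows; the hardware figures are hypothetical placeholders rather than actual NVDLA parameters, and the AlexNet conv1 operation count is approximate.

    # Minimal roofline-style per-layer time estimate. The hardware figures
    # below are hypothetical placeholders, not actual NVDLA specifications.
    PEAK_OPS_PER_S = 1.0e12    # assumed peak compute throughput (ops/s)
    PEAK_BYTES_PER_S = 10.0e9  # assumed peak DRAM bandwidth (bytes/s)

    def roofline_time(ops: float, bytes_moved: float) -> float:
        """A layer runs at whichever bound (compute or memory) is slower."""
        return max(ops / PEAK_OPS_PER_S, bytes_moved / PEAK_BYTES_PER_S)

    # Example: AlexNet conv1 performs roughly 105M MACs, i.e. ~210M ops;
    # the data-movement volume here is purely illustrative.
    t = roofline_time(ops=2 * 105e6, bytes_moved=3.6e6)
    print(f"estimated layer time: {t * 1e6:.0f} us")  # memory-bound: ~360 us

Summing such per-layer maxima over all layers of a network gives an end-to-end inference-time estimate of the kind the abstract reports for AlexNet and LeNet.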
Pages: 36-51
Number of pages: 16
Related papers (50 total)
  • [21] Learning to Design Accurate Deep Learning Accelerators with Inaccurate Multipliers
    Jain, Paras
    Huda, Safeen
    Maas, Martin
    Gonzalez, Joseph E.
    Stoica, Ion
    Mirhoseini, Azalia
    PROCEEDINGS OF THE 2022 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE 2022), 2022, : 184 - 189
  • [22] A generic energy prediction model of machine tools using deep learning algorithms
    He, Yan
    Wu, Pengcheng
    Li, Yufeng
    Wang, Yulin
    Tao, Fei
    Wang, Yan
    APPLIED ENERGY, 2020, 275
  • [23] FlexPDA: A Flexible Programming Framework for Deep Learning Accelerators
    Liu, Lei
    Ma, Xiu
    Liu, Hua-Xiao
    Li, Guang-Li
    Liu, Lei
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2022, 37 (05) : 1200 - 1220
  • [25] Evaluating Embedded FPGA Accelerators for Deep Learning Applications
    Hegde, Gopalakrishna
    Siddhartha
    Ramasamy, Nachiappan
    Buddha, Vamsi
    Kapre, Nachiket
    2016 IEEE 24TH ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM), 2016, : 25 - 25
  • [26] AccPar: Tensor Partitioning for Heterogeneous Deep Learning Accelerators
    Song, Linghao
    Chen, Fan
    Zhuo, Youwei
    Qian, Xuehai
    Li, Hai
    Chen, Yiran
    2020 IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA 2020), 2020, : 342 - 355
  • [27] A review of emerging trends in photonic deep learning accelerators
    Atwany, Mohammad
    Pardo, Sarah
    Serunjogi, Solomon
    Rasras, Mahmoud
    FRONTIERS IN PHYSICS, 2024, 12
  • [28] Survey of Deep Learning Accelerators for Edge and Emerging Computing
    Alam, Shahanur
    Yakopcic, Chris
    Wu, Qing
    Barnell, Mark
    Khan, Simon
    Taha, Tarek M.
    ELECTRONICS, 2024, 13 (15)
  • [29] Deep learning in analytical chemistry
    Debus, Bruno
    Parastar, Hadi
    Harrington, Peter
    Kirsanov, Dmitry
    TRAC-TRENDS IN ANALYTICAL CHEMISTRY, 2021, 145
  • [30] QuiltNet: Efficient Deep Learning Inference on Multi-Chip Accelerators Using Model Partitioning
    Park, Jongho
    Kwon, HyukJun
    Kim, Seowoo
    Lee, Junyoung
    Ha, Minho
    Lim, Euicheol
    Imani, Mohsen
    Kim, Yeseong
    PROCEEDINGS OF THE 59TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC 2022, 2022, : 1159 - 1164