AMAIX: A Generic Analytical Model for Deep Learning Accelerators

Cited by: 2
Authors
Juenger, Lukas [1 ]
Zurstrassen, Niko [1 ]
Kogel, Tim [2 ]
Keding, Holger [2 ]
Leupers, Rainer [1 ]
Affiliations
[1] Rhein Westfal TH Aachen, Inst Commun Technol & Embedded Syst ICE, Aachen, Germany
[2] Synopsys GmbH, Aschheim, Germany
Keywords
Deep Learning Accelerators; Analytical models; Design space exploration; Roofline model;
DOI
10.1007/978-3-030-60939-9_3
Chinese Library Classification (CLC): TP3 [Computing Technology; Computer Technology]
Discipline Code: 0812
Abstract
In recent years, the growing popularity of Convolutional Neural Networks (CNNs) has driven the development of specialized hardware, so-called Deep Learning Accelerators (DLAs). The large market for DLAs and the large number of papers published on DLA design show that there is currently no one-size-fits-all solution. Depending on the given optimization goals, such as power consumption or performance, there may be several optimal solutions for each scenario. A commonly used method for finding these solutions as early as possible in the design cycle is the use of analytical models, which describe a design with simple yet insightful and sufficiently accurate formulas. The main contribution of this work is the generic Analytical Model for AI accelerators (AMAIX) for the estimation of CNN inference performance on DLAs. It is based on the popular Roofline model. To show the validity of our approach, AMAIX was applied to the Nvidia Deep Learning Accelerator (NVDLA) as a case study, using the AlexNet and LeNet CNNs as workloads. The resulting performance predictions were verified against an RTL emulation of the NVDLA using a Synopsys ZeBu Server-based hybrid prototype. AMAIX predicted the inference time of AlexNet and LeNet on the NVDLA with an accuracy of up to 88% and 98%, respectively. Furthermore, this work shows how to use the obtained results for root-cause analysis and as a starting point for design space exploration.
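The Roofline model underlying AMAIX bounds a workload's attainable throughput by either peak compute or memory bandwidth, so a layer's execution time is the maximum of its compute time and its data-transfer time, and a network's inference time is the sum over layers. A minimal illustrative sketch follows; all hardware figures and the example layer are assumed placeholders, not actual NVDLA or AlexNet parameters:

```python
def roofline_layer_time(ops, bytes_moved, peak_ops_per_s, bandwidth_bytes_per_s):
    """Roofline estimate for one layer: the layer is bound by whichever
    resource is slower, compute or memory traffic."""
    compute_time = ops / peak_ops_per_s
    memory_time = bytes_moved / bandwidth_bytes_per_s
    return max(compute_time, memory_time)

# Hypothetical layer: 2e9 operations moving 50 MB of data, on an
# accelerator with 2 TOPS peak compute and 25 GB/s DRAM bandwidth.
t = roofline_layer_time(ops=2e9, bytes_moved=50e6,
                        peak_ops_per_s=2e12, bandwidth_bytes_per_s=25e9)
# Here memory_time (2 ms) exceeds compute_time (1 ms), so the layer is
# memory-bound; a whole-network estimate sums such per-layer times.
```

This per-layer max-of-two-times structure is what lets a Roofline-style model flag whether a given layer is compute- or bandwidth-limited, which is the starting point for the root-cause analysis mentioned in the abstract.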
Pages: 36-51
Page count: 16
Related Papers
50 in total
  • [41] Test and Yield Loss Reduction of AI and Deep Learning Accelerators
    Sadi, Mehdi
    Guin, Ujjwal
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2022, 41 (01) : 104 - 115
  • [42] SEALing Neural Network Models in Encrypted Deep Learning Accelerators
    Zuo, Pengfei
    Hua, Yu
    Liang, Ling
    Xie, Xinfeng
    Hu, Xing
    Xie, Yuan
    2021 58TH ACM/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2021, : 1255 - 1260
  • [43] Neutrons Sensitivity of Deep Reinforcement Learning Policies on EdgeAI Accelerators
    Bodmann, Pablo R.
    Saveriano, Matteo
    Kritikakou, Angeliki
    Rech, Paolo
    IEEE TRANSACTIONS ON NUCLEAR SCIENCE, 2024, 71 (08) : 1480 - 1486
  • [44] A Survey and Taxonomy of FPGA-based Deep Learning Accelerators
    Blaiech, Ahmed Ghazi
    Ben Khalifa, Khaled
    Valderrama, Carlos
    Fernandes, Marcelo A. C.
    Bedoui, Mohamed Hedi
    JOURNAL OF SYSTEMS ARCHITECTURE, 2019, 98 : 331 - 345
  • [45] Analyzing and Mitigating Circuit Aging Effects in Deep Learning Accelerators
    Das, Sanjay
    Kundu, Shamik
    Menon, Anand
    Ren, Yihui
    Kharel, Shubha
    Basu, Kanad
    2024 IEEE 42ND VLSI TEST SYMPOSIUM, VTS 2024, 2024,
  • [46] A Comprehensive Evaluation of Novel AI Accelerators for Deep Learning Workloads
    Emani, Murali
    Xie, Zhen
    Raskar, Siddhisanket
    Sastry, Varuni
    Arnold, William
    Wilson, Bruce
    Thakur, Rajeev
    Vishwanath, Venkatram
    Liu, Zhengchun
    Papka, Michael E.
    Bohorquez, Cindy Orozco
    Weisner, Rick
    Li, Karen
    Sheng, Yongning
    Du, Yun
    Zhang, Jian
    Tsyplikhin, Alexander
    Khaira, Gurdaman
    Fowers, Jeremy
    Sivakumar, Ramakrishnan
    Godsoe, Victoria
    Macias, Adrian
    Tekur, Chetan
    Boyd, Matthew
    2022 IEEE/ACM INTERNATIONAL WORKSHOP ON PERFORMANCE MODELING, BENCHMARKING AND SIMULATION OF HIGH PERFORMANCE COMPUTER SYSTEMS (PMBS), 2022, : 13 - 25
  • [47] Deep Learning at Scale on NVIDIA V100 Accelerators
    Xu, Rengan
    Han, Frank
    Ta, Quy
    PROCEEDINGS OF 2018 IEEE/ACM PERFORMANCE MODELING, BENCHMARKING AND SIMULATION OF HIGH PERFORMANCE COMPUTER SYSTEMS (PMBS 2018), 2018, : 23 - 32
  • [48] FIdelity: Efficient Resilience Analysis Framework for Deep Learning Accelerators
    He, Yi
    Balaprakash, Prasanna
    Li, Yanjing
    2020 53RD ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO 2020), 2020, : 270 - 281
  • [49] Co-designed Systems for Deep Learning Hardware Accelerators
    Brooks, David M.
    2018 INTERNATIONAL SYMPOSIUM ON VLSI TECHNOLOGY, SYSTEMS AND APPLICATION (VLSI-TSA), 2018,
  • [50] Kernel Mapping Techniques for Deep Learning Neural Network Accelerators
    Ozdemir, Sarp
    Khasawneh, Mohammad
    Rao, Smriti
    Madden, Patrick H.
    ISPD'22: PROCEEDINGS OF THE 2022 INTERNATIONAL SYMPOSIUM ON PHYSICAL DESIGN, 2022, : 21 - 28