Automatic Generation of Fast and Accurate Performance Models for Deep Neural Network Accelerators

Cited: 0
Authors
Luebeck, Konstantin [1 ]
Jung, Alexander Louis-Ferdinand [1 ]
Wedlich, Felix [1 ]
Mueller, Mika Markus [1 ]
Peccia, Federico Nicolas [2 ]
Thoemmes, Felix [2 ]
Steinmetz, Jannik [1 ]
Biermaier, Valentin [1 ]
Frischknecht, Adrian [1 ]
Bernardo, Paul Palomero [1 ]
Bringmann, Oliver [1 ]
Affiliations
[1] Univ Tubingen, Embedded Syst, Tubingen, Baden-Wurttemberg, Germany
[2] FZI, Karlsruhe, Baden-Wurttemberg, Germany
Keywords
Deep neural networks; performance estimation; analytical model
DOI
10.1145/3715122
Chinese Library Classification
TP3 [computing technology, computer technology]
Subject classification code
0812
Abstract
Implementing Deep Neural Networks (DNNs) on resource-constrained edge devices is a challenging task that requires tailored hardware accelerator architectures and a clear understanding of their performance characteristics when executing the intended AI workload. To facilitate this, we present an automated generation approach for fast performance models to accurately estimate the latency of a DNN mapped onto systematically modeled and concisely described accelerator architectures. Using our accelerator architecture description method, we modeled representative DNN accelerators such as Gemmini, UltraTrail, Plasticine-derived, and a parameterizable systolic array. Together with DNN mappings for those modeled architectures, we perform a combined DNN/hardware dependency graph analysis, which enables us, in the best case, to evaluate only 154 loop kernel iterations to estimate the performance of 4.19 billion instructions, achieving a significant speedup. We outperform regression and analytical models in terms of mean absolute percentage error (MAPE) compared with simulation results, while being several orders of magnitude faster than an RTL simulation.
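The core idea the abstract describes, evaluating only a handful of unique loop kernels instead of every instruction, can be sketched as follows. This is an illustrative assumption of how such an estimate works, not the paper's actual model; the kernel names, cycle counts, and the `estimate_latency` helper are all hypothetical.

```python
# Hypothetical sketch: evaluate each unique loop kernel once and scale its
# latency by its repetition count, rather than simulating every iteration.
from collections import Counter

def estimate_latency(kernel_trace, kernel_latency):
    """Estimate total latency from a trace of kernel identifiers.

    kernel_trace   -- sequence of kernel identifiers, one per iteration
    kernel_latency -- callable mapping a kernel id to its cycle count,
                      evaluated once per *unique* kernel
    """
    counts = Counter(kernel_trace)            # repetitions per unique kernel
    return sum(kernel_latency(k) * n          # one evaluation, scaled by count
               for k, n in counts.items())

# Toy usage: three unique kernels cover a trace of one million iterations,
# so only three kernel evaluations are needed for the full estimate.
trace = ["conv3x3"] * 600_000 + ["dwconv"] * 300_000 + ["fc"] * 100_000
latencies = {"conv3x3": 120, "dwconv": 45, "fc": 800}
total_cycles = estimate_latency(trace, latencies.__getitem__)
```

The speedup comes from the ratio of trace length to unique kernels; the paper's best case (154 kernel iterations standing in for 4.19 billion instructions) is the same principle at scale.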
Pages: 32
Related Papers (50 in total)
  • [1] Work-in-Progress: Ultra-fast yet Accurate Performance Prediction for Deep Neural Network Accelerators
    Luebeck, Konstantin
    Jung, Alexander Louis-Ferdinand
    Wedlich, Felix
    Bringmann, Oliver
    2022 INTERNATIONAL CONFERENCE ON COMPILERS, ARCHITECTURE, AND SYNTHESIS FOR EMBEDDED SYSTEMS (CASES 2022), 2022, : 27 - 28
  • [2] Automatic Kernel Generation for Large Language Models on Deep Learning Accelerators
    Wang, Fuyu
    Shen, Minghua
    2023 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER AIDED DESIGN, ICCAD, 2023,
  • [3] Fast Loosely-Timed Deep Neural Network Models with Accurate Memory Contention
    Arasteh, Emad M.
    Domer, Rainer
    ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 2024, 23 (05)
  • [4] SEALing Neural Network Models in Encrypted Deep Learning Accelerators
    Zuo, Pengfei
    Hua, Yu
    Liang, Ling
    Xie, Xinfeng
    Hu, Xing
    Xie, Yuan
    2021 58TH ACM/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2021, : 1255 - 1260
  • [5] Automatic Tool for Fast Generation of Custom Convolutional Neural Networks Accelerators for FPGA
    Rivera-Acosta, Miguel
    Ortega-Cisneros, Susana
    Rivera, Jorge
    ELECTRONICS, 2019, 8 (06)
  • [6] Joint Protection Scheme for Deep Neural Network Hardware Accelerators and Models
    Zhou, Jingbo
    Zhang, Xinmiao
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2023, 42 (12) : 4518 - 4527
  • [7] SHE: A Fast and Accurate Deep Neural Network for Encrypted Data
    Lou, Qian
    Jiang, Lei
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [8] Fast Inner-Product Algorithms and Architectures for Deep Neural Network Accelerators
    Pogue, Trevor E.
    Nicolici, Nicola
    IEEE TRANSACTIONS ON COMPUTERS, 2024, 73 (02) : 495 - 509
  • [9] Fast-AT: Fast Automatic Thumbnail Generation using Deep Neural Networks
    Esmaeili, Seyed A.
    Singh, Bharat
    Davis, Larry S.
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 4178 - 4186
  • [10] NNSim: A Fast and Accurate SystemC/TLM Simulator for Deep Convolutional Neural Network Accelerators
    Lee, Yi-Che
    Hsu, Ting-Shuo
    Chen, Chun-Tse
    Liou, Jing-Jia
    Lu, Juin-Ming
    2019 INTERNATIONAL SYMPOSIUM ON VLSI DESIGN, AUTOMATION AND TEST (VLSI-DAT), 2019,