Automatic Generation of Fast and Accurate Performance Models for Deep Neural Network Accelerators

Authors
Luebeck, Konstantin [1]
Jung, Alexander Louis-Ferdinand [1]
Wedlich, Felix [1]
Mueller, Mika Markus [1]
Peccia, Federico Nicolas [2]
Thoemmes, Felix [2]
Steinmetz, Jannik [1]
Biermaier, Valentin [1]
Frischknecht, Adrian [1]
Bernardo, Paul Palomero [1]
Bringmann, Oliver [1]
Affiliations
[1] Univ Tubingen, Embedded Syst, Tubingen, Baden-Wurttemberg, Germany
[2] FZI, Karlsruhe, Baden-Wurttemberg, Germany
Keywords
Deep neural networks; performance estimation; analytical model
DOI
10.1145/3715122
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Implementing Deep Neural Networks (DNNs) on resource-constrained edge devices is a challenging task that requires tailored hardware accelerator architectures and a clear understanding of their performance characteristics when executing the intended AI workload. To facilitate this, we present an automated generation approach for fast performance models that accurately estimate the latency of a DNN mapped onto systematically modeled and concisely described accelerator architectures. Using our accelerator architecture description method, we modeled representative DNN accelerators such as Gemmini, UltraTrail, a Plasticine-derived architecture, and a parameterizable systolic array. Together with DNN mappings for those modeled architectures, we perform a combined DNN/hardware dependency graph analysis, which enables us, in the best case, to evaluate only 154 loop kernel iterations to estimate the performance for 4.19 billion instructions, achieving a significant speedup. We outperform regression and analytical models in terms of mean absolute percentage error (MAPE) compared with simulation results, while being several orders of magnitude faster than an RTL simulation.
Pages: 32