Efficient compiler code generation for Deep Learning Snowflake co-processor

Cited by: 0
Authors
Chang, Andre Xian Ming [1]
Zaidy, Aliasger [1]
Culurciello, Eugenio [1]
Affiliations
[1] FWDNXT, W Lafayette, IN 47906 USA
Keywords
Deep learning; neural networks; co-processor; compiler
DOI
10.1109/EMC2.2018.00013
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep Neural Networks (DNNs) are widely used in applications including image classification, semantic segmentation, and natural language processing. Various DNN models have been developed to achieve high accuracy on different tasks. Efficiently mapping the workflow of those models onto custom accelerators requires programmable hardware and a custom compiler. In this work, we use Snowflake, a programmable DNN-targeted accelerator. We also present a compiler that correctly generates code for Snowflake. Our system was evaluated on various convolution layers present in AlexNet, ResNet, and LightCNN. Snowflake with 256 processing units was implemented on a Xilinx FPGA and achieved 70 frames/s for AlexNet without linear layers.
Pages: 24-28
Page count: 5