Efficient compiler code generation for Deep Learning Snowflake co-processor

Cited by: 0

Authors:

Chang, Andre Xian Ming [1]

Zaidy, Aliasger [1]

Culurciello, Eugenio [1]

Affiliations:

[1] FWDNXT, W Lafayette, IN 47906 USA

Source:

2018 1ST WORKSHOP ON ENERGY EFFICIENT MACHINE LEARNING AND COGNITIVE COMPUTING FOR EMBEDDED APPLICATIONS (EMC2) | 2018

Keywords:

Deep learning; neural networks; co-processor; compiler

DOI:

10.1109/EMC2.2018.00013

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Deep Neural Networks (DNNs) are widely used in applications including image classification, semantic segmentation, and natural language processing. Many DNN models have been developed to achieve high accuracy on different tasks. Efficiently mapping the workflow of those models onto custom accelerators requires programmable hardware and a custom compiler. In this work, we use Snowflake, a programmable DNN-targeted accelerator. We also present a compiler that correctly generates code for Snowflake. Our system was evaluated on various convolution layers present in AlexNet, ResNet, and LightCNN. Snowflake with 256 processing units was implemented on a Xilinx FPGA, and it achieved 70 frames/s for AlexNet without linear layers.

Pages: 24-28

Page count: 5
