Efficient compiler code generation for Deep Learning Snowflake co-processor

Cited by: 0

Authors:

Chang, Andre Xian Ming [1]

Zaidy, Aliasger [1]

Culurciello, Eugenio [1]

Affiliations:

[1] FWDNXT, W Lafayette, IN 47906 USA

Source:

2018 1ST WORKSHOP ON ENERGY EFFICIENT MACHINE LEARNING AND COGNITIVE COMPUTING FOR EMBEDDED APPLICATIONS (EMC2) | 2018

Keywords:

Deep learning; neural networks; co-processor; compiler

DOI:

10.1109/EMC2.2018.00013

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Deep Neural Networks (DNNs) are widely used in applications including image classification, semantic segmentation, and natural language processing. Various DNN models have been developed to achieve high accuracy on different tasks. Efficiently mapping the workflow of those models onto custom accelerators requires programmable hardware and a custom compiler. In this work, we use Snowflake, a programmable DNN-targeted accelerator. We also present a compiler that correctly generates code for Snowflake. Our system was evaluated on various convolution layers present in AlexNet, ResNet, and LightCNN. Snowflake with 256 processing units was implemented on a Xilinx FPGA, and it achieved 70 frames/s for AlexNet without linear layers.

Citation:

Pages: 24-28

Page count: 5
