Efficient compiler code generation for Deep Learning Snowflake co-processor

Cited by: 0
Authors
Chang, Andre Xian Ming [1]
Zaidy, Aliasger [1]
Culurciello, Eugenio [1]
Affiliations
[1] FWDNXT, W Lafayette, IN 47906 USA
Keywords
Deep learning; neural networks; co-processor; compiler
DOI
10.1109/EMC2.2018.00013
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep Neural Networks (DNNs) are widely used in applications including image classification, semantic segmentation, and natural language processing. Various DNN models have been developed to achieve high accuracy on different tasks. Efficiently mapping the workflow of those models onto custom accelerators requires programmable hardware and a custom compiler. In this work, we use Snowflake, a programmable DNN-targeted accelerator. We also present a compiler that correctly generates code for Snowflake. Our system was evaluated on various convolution layers present in AlexNet, ResNet, and LightCNN. Snowflake with 256 processing units was implemented on a Xilinx FPGA, and it achieved 70 frames/s for AlexNet without linear layers.
Pages: 24-28
Number of pages: 5