A Pipelined and Scalable Dataflow Implementation of Convolutional Neural Networks on FPGA

Cited by: 11
Authors
Bacis, Marco [1]
Natale, Giuseppe [1]
Del Sozzo, Emanuele [1]
Santambrogio, Marco Domenico [1]
Affiliations
[1] Politecn Milan, Dipartimento Elettron Informaz & Bioingn, Milan, Italy
Keywords
Field Programmable Gate Arrays; Convolutional Neural Networks; Dataflow Architectures; COPROCESSOR; PERFORMANCE;
DOI
10.1109/IPDPSW.2017.44
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The Convolutional Neural Network (CNN) is a deep learning algorithm extended from the Artificial Neural Network (ANN) and widely used for image classification and recognition, thanks to its invariance to distortions. The recent rapid growth of applications based on deep learning algorithms, especially in the context of Big Data analytics, has dramatically increased both industrial and academic research on, and exploration of, optimized implementations of CNNs on accelerators such as GPUs, FPGAs and ASICs, as general-purpose processors can hardly meet the ever-increasing performance and energy-efficiency requirements. FPGAs in particular are one of the most attractive alternatives, as they allow the exploitation of the implicit parallelism of the algorithm and the acceleration of the different layers of a CNN with custom optimizations, while retaining extreme flexibility thanks to their reconfigurability. In this work, we propose a methodology to implement CNNs on FPGAs in a modular, scalable way. This is done by exploiting the dataflow pattern of convolutions, using an approach derived from previous work on the acceleration of Iterative Stencil Loops (ISLs), a computational pattern that shares some characteristics with convolutions. Furthermore, this approach allows the implementation of a high-level pipeline between the different network layers, resulting in an increase of the overall performance when the CNN is employed to process batches of multiple images, as would happen in real-life scenarios.
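Note: the abstract's claim that a pipeline between layers helps on batches of images can be illustrated with a textbook idealized pipeline model. The sketch below is not taken from the paper; the function names and the per-layer latencies are hypothetical, and it assumes each layer is one pipeline stage with a fixed latency and no stalls.

```python
def sequential_time(stage_times, batch_size):
    """Total time when each image finishes all layers before the next one starts."""
    return batch_size * sum(stage_times)


def pipelined_time(stage_times, batch_size):
    """Total time for an ideal layer-level pipeline.

    The first image takes the full pipeline latency; every further image
    completes one slowest-stage interval later.
    """
    return sum(stage_times) + (batch_size - 1) * max(stage_times)


# Hypothetical per-layer latencies in arbitrary time units (not from the paper).
layer_times = [4, 2, 3]

for n in (1, 8):
    print(n, sequential_time(layer_times, n), pipelined_time(layer_times, n))
```

Under this model a single image gains nothing from pipelining, while larger batches approach a throughput bounded by the slowest layer, which is consistent with the abstract's emphasis on batch processing.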
Pages: 90-97
Page count: 8
Related papers
50 in total
  • [21] An Efficient Dataflow Mapping Method for Convolutional Neural Networks
    Liu, Zhuangzhuang
    Gu, Huaxi
    Zhang, Bowen
    Shi, Canran
    NEURAL PROCESSING LETTERS, 2022, 54 (02) : 1075 - 1090
  • [22] A scalable and pipelined FPGA implementation of an OC192WF scheduler
    Merhebi, A
    Mohamed, OA
    2004 IEEE INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE TECHNOLOGY, PROCEEDINGS, 2004, : 395 - 398
  • [23] A Multistage Dataflow Implementation of a Deep Convolutional Neural Network Based on FPGA For High-Speed Object Recognition
    Li, Ning
    Takaki, Shunpei
    Tomioka, Yoichi
    Kitazawa, Hitoshi
    2016 IEEE SOUTHWEST SYMPOSIUM ON IMAGE ANALYSIS AND INTERPRETATION (SSIAI), 2016, : 165 - 168
  • [24] A scalable FPGA implementation of cellular neural networks for Gabor-type filtering
    Cheung, Ocean Y. H.
    Leong, Philip H. W.
    Tsang, Eric K. C.
    Shi, Bertram E.
    2006 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORK PROCEEDINGS, VOLS 1-10, 2006, : 15 - 20
  • [25] FPGA Implementation of an Ultrasonic Flaw Detection Algorithm Based on Convolutional Neural Networks
    Yuan, Y.
    Virupakshappa, K.
    Oruklu, E.
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2022, 94 (12): : 1447 - 1457
  • [26] FPGA Implementation of an Ultrasonic Flaw Detection Algorithm Based on Convolutional Neural Networks
    Y. Yuan
    K. Virupakshappa
    E. Oruklu
    Journal of Signal Processing Systems, 2022, 94 : 1447 - 1457
  • [27] Accelerating Sparse Convolutional Neural Networks Based on Dataflow Architecture
    Wu, Xinxin
    Li, Yi
    Ou, Yan
    Li, Wenming
    Sun, Shibo
    Xu, Wenxing
    Fan, Dongrui
    ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2020, PT II, 2020, 12453 : 14 - 31
  • [28] FlexFlow: A Flexible Dataflow Accelerator Architecture for Convolutional Neural Networks
    Lu, Wenyan
    Yan, Guihai
    Li, Jiajun
    Gong, Shijun
    Han, Yinhe
    Li, Xiaowei
    2017 23RD IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA), 2017, : 553 - 564
  • [29] Scalable Convolutional Neural Networks for Decoding of Terminated Convolutional Codes
    Teich, Werner G.
    Pan, Weikun
    ADVANCES IN COMPUTATIONAL INTELLIGENCE, IWANN 2023, PT I, 2023, 14134 : 43 - 54
  • [30] Stereoscopic scalable quantum convolutional neural networks
    Baek, Hankyul
    Yun, Won Joon
    Park, Soohyun
    Kim, Joongheon
    NEURAL NETWORKS, 2023, 165 : 860 - 867