FFConv: An FPGA-based Accelerator for Fast Convolution Layers in Convolutional Neural Networks

Cited by: 16

Authors:

Ahmad, Afzal [1]

Pasha, Muhammad Adeel [1]

Affiliations:

[1] Lahore Univ Management Sci LUMS, Dept Elect Engn, Lahore, Pakistan

Source:

ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS | 2020 / Vol. 19 / Issue 02

Keywords:

FPGA; convolutional neural networks; hardware acceleration

DOI:

10.1145/3380548

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Subject classification code:

0812

Abstract:

Image classification is known to be one of the most challenging problems in the domain of computer vision. Significant research is being done on developing systems and algorithms that improve accuracy, performance, area, and power consumption for related problems. Convolutional Neural Networks (CNNs) have been shown to give outstanding accuracies for problems such as image classification, object detection, and semantic segmentation. While CNNs are pioneering the development of high-accuracy systems, their excessive computational complexity presents a barrier to more widespread deployment. Although Graphics Processing Units (GPUs), due to their massively parallel architecture, have been shown to give performance orders of magnitude better than general-purpose processors, the former are limited by their high power consumption and generality. Consequently, Field Programmable Gate Arrays (FPGAs) are being explored to implement CNN architectures, as they also provide massively parallel logic resources but with relatively lower power consumption than GPUs. In this article, we present FFConv, an efficient FPGA-based fast convolutional layer accelerator for CNNs. We design a pipelined, high-throughput convolution engine based on the Winograd minimal filtering (also called Fast Convolution) algorithms for computing the convolutional layers of three popular CNN architectures: VGG16, AlexNet, and ShuffleNet. We implement our accelerator on a Virtex-7 FPGA platform, where we exploit computational parallelization to the maximum while exploring optimizations aimed at improving performance. The resultant design loses only 0.43%, 0.47%, and 0.61% Top-1 classification accuracy for VGG16, AlexNet, and ShuffleNet-v1, respectively, while significantly improving throughput, resource, and power efficiency compared to previous state-of-the-art designs.
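
The abstract names Winograd minimal filtering without showing what it computes. As a rough illustration only, the following Python sketch shows the smallest textbook case, F(2,3), which produces two outputs of a 3-tap filter with four multiplications instead of six. This is not the FFConv hardware design: the tile size, the function names winograd_f23 and direct_f23, and the 1-D setting are all assumptions made for the example, and the record does not state which Winograd variants the accelerator uses.

```python
def direct_f23(d, g):
    """Reference result: slide a 3-tap filter g over four inputs d (6 multiplies)."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Winograd F(2,3): the same two outputs using 4 element-wise multiplies.

    Hypothetical helper for illustration; not taken from the FFConv paper.
    """
    # Filter transform (can be precomputed once per filter).
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]

    # Input transform followed by the four multiplications.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3

    # Output transform: additions and subtractions only.
    return [m1 + m2 + m3, m2 - m3 - m4]


if __name__ == "__main__":
    d = [1, 2, 3, 4]
    g = [1, 0, -1]
    print(direct_f23(d, g))
    print(winograd_f23(d, g))
```

The saving comes from moving work into the input and filter transforms, which use only additions and a division by two; the filter transform can be computed once and reused across all input tiles.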

Pages: 24

50 items in total

[1] FPGA-based Accelerator for Losslessly Quantized Convolutional Neural Networks
Sit, Mankit
Kazami, Ryosuke
Amano, Hideharu
[J]. 2017 INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE TECHNOLOGY (ICFPT), 2017, : 295 - 298
[2] An FPGA-based Accelerator Implementation for Deep Convolutional Neural Networks
Zhou, Yongmei
Jiang, Jingfei
[J]. PROCEEDINGS OF 2015 4TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT 2015), 2015, : 829 - 832
[3] Composite FPGA-based Accelerator for Deep Convolutional Neural Networks
HuanZhang
YuanYang
YangXiao
[J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ELECTRON DEVICES AND SOLID-STATE CIRCUITS (EDSSC), 2019,
[4] A FPGA-based Hardware Accelerator for Multiple Convolutional Neural Networks
Yao, Yuchen
Duan, Qinghua
Zhang, Zhiqian
Gao, Jiabao
Wang, Jian
Yang, Meng
Tao, Xinxuan
Lai, Jinmei
[J]. 2018 14TH IEEE INTERNATIONAL CONFERENCE ON SOLID-STATE AND INTEGRATED CIRCUIT TECHNOLOGY (ICSICT), 2018, : 1075 - 1077
[5] Optimization of Energy Efficiency for FPGA-Based Convolutional Neural Networks Accelerator
Tang, Yongming
Dai, Rongshi
Xie, Yi
[J]. 2020 4TH INTERNATIONAL CONFERENCE ON CONTROL ENGINEERING AND ARTIFICIAL INTELLIGENCE (CCEAI 2020), 2020, 1487
[6] SpCNA: An FPGA-based Accelerator for Point Cloud Convolutional Neural Networks
Zhou, Gong-Lang
Guo, Kaiyuan
Chen, Xiang
Leung, Kwok Wa
[J]. 2023 IEEE 31ST ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES, FCCM, 2023, : 211 - 211
[7] FPGA-based Accelerator for Deep Convolutional Neural Networks for the SPARK Environment
Morcel, Raghid
Ezzeddine, Mazen
Akkary, Haitham
[J]. 2016 IEEE INTERNATIONAL CONFERENCE ON SMART CLOUD (SMARTCLOUD), 2016, : 126 - 133
[8] A Fast and Flexible FPGA-based Accelerator for Natural Language Processing Neural Networks
Suyeon, Hur
Na, Seongmin
Kwon, Dongup
Joonsung, Kim
Boutros, Andrew
Nurvitadhi, Eriko
Kim, Jangwoo
[J]. ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2022, 20 (01)
[9] A High Performance FPGA-based Accelerator for Large-Scale Convolutional Neural Networks
Li, Huimin
Fan, Xitian
Jiao, Li
Cao, Wei
Zhou, Xuegong
Wang, Lingli
[J]. 2016 26TH INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE LOGIC AND APPLICATIONS (FPL), 2016,
[10] Optimizing FPGA-based Convolutional Neural Networks Accelerator for Image Super-Resolution
Chang, Jung-Woo
Kang, Suk-Ju
[J]. 2018 23RD ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC), 2018, : 343 - 348