Fast and Efficient Implementation of Convolutional Neural Networks on FPGA

Cited by: 0

Authors:

Podili, Abhinav [1]

Zhang, Chi [1]

Prasanna, Viktor [1]

Affiliation:

[1] Univ Southern Calif, Ming Hsieh Dept Elect Engn, Los Angeles, CA 90089 USA

Source:

2017 IEEE 28TH INTERNATIONAL CONFERENCE ON APPLICATION-SPECIFIC SYSTEMS, ARCHITECTURES AND PROCESSORS (ASAP) | 2017

Keywords:

Convolutional neural networks; Winograd minimal filtering algorithm; Efficient; Double buffering; Data reuse; Pipelining;

DOI:

Not available

Chinese Library Classification:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

State-of-the-art CNN models for image recognition use deep networks with small filters instead of shallow networks with large filters, because the former require fewer weights. In light of this trend, we present a fast and efficient FPGA-based convolution engine to accelerate CNN models with small filters. The convolution engine implements the Winograd minimal filtering algorithm to reduce the number of multiplications by 38% to 55% for state-of-the-art CNNs. We exploit the parallelism of the Winograd convolution engine to scale the overall performance. We show that our overall design sustains the peak throughput of the convolution engines. We propose a novel data layout that halves the memory bandwidth required by our design. One noteworthy feature of our Winograd convolution engine is that it hides the computation latency of the pooling layer. As a case study, we implement the VGG16 CNN model and compare it with previous approaches. Compared with the state-of-the-art reduced-precision VGG16 implementation, our implementation achieves a 1.2x improvement in throughput while using 3x fewer multipliers and 2x less on-chip memory, without impacting classification accuracy. The improvements in throughput per multiplier and throughput per unit of on-chip memory are 3.7x and 2.47x, respectively, compared with the state-of-the-art design.


Pages: 11 - 18

Page count: 8
