TileNET: Scalable Architecture for High-throughput Ternary Convolution Neural Networks using FPGAs

Cited by: 4

Authors:

Vikram, Sahu Sai [1]

Pant, Vibha [2]

Mody, Mihir [3]

Purnaprajna, Madhura [2]

Affiliations:

[1] Amrita Univ, Dept Elect & Commun Engn, Bengaluru, India

[2] Amrita Univ, Dept Comp Sci Engn, Bengaluru, India

[3] Texas Instruments Inc, Bengaluru, India

Source:

2018 31ST INTERNATIONAL CONFERENCE ON VLSI DESIGN AND 2018 17TH INTERNATIONAL CONFERENCE ON EMBEDDED SYSTEMS (VLSID & ES) | 2018

Keywords:

DOI:

10.1109/VLSID.2018.113

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Convolution Neural Networks (CNNs) are becoming increasingly popular in advanced driver assistance systems (ADAS) and automated driving (AD) for camera perception, enabling applications such as object detection, lane detection, and semantic segmentation. The ever-increasing need for multiple high-resolution cameras around the car necessitates very high throughput, on the order of a few tens of tera multiply-accumulate operations per second (TMACS), along with high detection accuracy. Existing implementations do not scale, with performance only on the order of a few giga-operations per second. This paper proposes a novel tiled architecture for CNNs that uses only ternarized weights, while input and output features are kept at full precision, resulting in minimal loss of accuracy. The proposed solution is implemented on a Virtex-7 FPGA, resulting in a throughput of 13.76 TOPS. In post-implementation power simulation, AlexNet consumes 16 W, orders of magnitude lower than that of existing GPUs.


Pages: 461-462

Page count: 2
