TileNET: Scalable Architecture for High-throughput Ternary Convolution Neural Networks using FPGAs

Cited by: 4
Authors
Vikram, Sahu Sai [1]
Pant, Vibha [2]
Mody, Mihir [3]
Purnaprajna, Madhura [2]
Affiliations
[1] Amrita Univ, Dept Elect & Commun Engn, Bengaluru, India
[2] Amrita Univ, Dept Comp Sci Engn, Bengaluru, India
[3] Texas Instruments Inc, Bengaluru, India
DOI
10.1109/VLSID.2018.113
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Convolutional Neural Networks (CNNs) are becoming increasingly popular in advanced driver assistance systems (ADAS) and automated driving (AD) for camera perception, enabling applications such as object detection, lane detection, and semantic segmentation. The ever-increasing need for multiple high-resolution cameras around the car necessitates very high throughput, on the order of a few tens of tera multiply-accumulate operations per second (TMACS), along with high detection accuracy. Existing implementations do not scale, with performance only on the order of a few giga operations per second. This paper proposes a novel tiled architecture for CNNs that uses only ternarized weights, while input and output features are kept at full precision, resulting in minimal loss of accuracy. The proposed solution is implemented on a Virtex-7 FPGA and achieves a throughput of 13.76 TOPS. Post-implementation power simulation for AlexNet shows a consumption of 16 W, orders of magnitude lower than that of existing GPUs.
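To illustrate why ternarized weights suit hardware, the following is a minimal Python sketch, not the paper's implementation. It assumes a simple symmetric threshold rule for ternarization (the abstract does not state which rule the authors use), and the names `ternarize`, `ternary_dot`, and `ternary_conv1d` are hypothetical. The point it shows is that with weights restricted to -1, 0, and +1, each multiply-accumulate reduces to an add, a subtract, or a skip, while the features stay at full precision.

```python
def ternarize(weights, threshold):
    """Map each real weight to -1, 0, or +1.

    The symmetric threshold rule is an assumption made for illustration;
    the abstract does not specify the ternarization method.
    """
    return [1 if w > threshold else -1 if w < -threshold else 0 for w in weights]


def ternary_dot(features, ternary_weights):
    """Dot product with ternary weights: add, subtract, or skip (no multiplies)."""
    acc = 0.0
    for x, w in zip(features, ternary_weights):
        if w == 1:
            acc += x      # weight +1: accumulate the feature
        elif w == -1:
            acc -= x      # weight -1: subtract the feature
        # weight 0: the feature contributes nothing
    return acc


def ternary_conv1d(signal, ternary_kernel):
    """Valid-mode 1-D sliding-window correlation built on ternary_dot."""
    k = len(ternary_kernel)
    return [ternary_dot(signal[i:i + k], ternary_kernel)
            for i in range(len(signal) - k + 1)]


if __name__ == "__main__":
    t = ternarize([0.8, -0.05, -0.6, 0.02], threshold=0.1)
    print(t)
    print(ternary_dot([1.5, 2.0, 0.5, 4.0], t))
    print(ternary_conv1d([1.0, 2.0, 3.0, 4.0], [1, -1]))
```

A 2-D convolution layer follows the same pattern, with the window taken over rows, columns, and input channels; the sketch stays in one dimension only to keep the arithmetic easy to follow.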
Pages: 461-462
Number of pages: 2