A high-throughput scalable BNN accelerator with fully pipelined architecture

Authors
Zhe Han
Jingfei Jiang
Jinwei Xu
Peng Zhang
Xiaoqiang Zhao
Dong Wen
Yong Dou
Affiliations
[1] National University of Defense Technology
Keywords
CNN; BNN; FPGA; Accelerator
DOI
Not available
Abstract
By replacing multiplication with XNOR operations, Binarized Neural Networks (BNNs) are hardware-friendly and well suited to FPGA acceleration. Previous research has highlighted the performance potential of BNNs. However, most existing work targets minimizing chip area: it achieves excellent energy and resource efficiency on small FPGAs, while the results on larger FPGAs are unsatisfying. We therefore propose a scalable, fully pipelined BNN architecture that targets maximizing throughput while maintaining energy and resource efficiency on large FPGAs. By exploiting multiple levels of parallelism and balancing the pipeline stages, it achieves excellent performance. Moreover, we share on-chip memory and balance the computation resources to further improve resource utilization. We then propose a methodology that explores the design space for the optimal configuration. The work is evaluated on a Xilinx UltraScale XCKU115. The results show that the proposed architecture achieves 2.24×–11.24× higher performance and 2.43×–11.79× better resource efficiency than other BNN accelerators.
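The abstract's opening claim, that XNOR can stand in for multiplication, can be illustrated with a minimal software sketch. It assumes the common BNN encoding in which +1 is stored as bit 1 and -1 as bit 0; the function names (`pack_bits`, `xnor_popcount_dot`, `reference_dot`) are hypothetical and are not taken from the paper, and the code says nothing about the authors' hardware design.

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into an int (bit i is 1 when values[i] == +1)."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def xnor_popcount_dot(a_bits, b_bits, n):
    """Dot product of two n-element +1/-1 vectors given in packed form.

    XNOR yields 1 wherever the two signs agree (product +1) and 0 where
    they differ (product -1), so the dot product is matches - mismatches.
    """
    mask = (1 << n) - 1                 # keep only the n valid bit positions
    xnor = ~(a_bits ^ b_bits) & mask
    matches = bin(xnor).count("1")      # popcount
    return 2 * matches - n


def reference_dot(a, b):
    """Ordinary multiply-accumulate dot product, used only for comparison."""
    return sum(x * y for x, y in zip(a, b))


a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
result = xnor_popcount_dot(pack_bits(a), pack_bits(b), len(a))
assert result == reference_dot(a, b)
```

In hardware the same identity lets a multiplier be replaced by an XNOR gate followed by a population count, which is what makes BNNs attractive for FPGA logic.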
Pages: 17-30
Page count: 13