A high-throughput scalable BNN accelerator with fully pipelined architecture

Authors
Zhe Han
Jingfei Jiang
Jinwei Xu
Peng Zhang
Xiaoqiang Zhao
Dong Wen
Yong Dou
Affiliation
[1] National University of Defense Technology
Keywords
CNN; BNN; FPGA; Accelerator;
DOI
Not available
Abstract
By replacing multiplication with XNOR operations, Binarized Neural Networks (BNNs) are hardware-friendly and well suited to FPGA acceleration. Previous research has highlighted the performance potential of BNNs. However, most existing work aims to minimize chip area: it achieves excellent energy and resource efficiency on small FPGAs, but the results on larger FPGAs are unsatisfactory. We therefore propose a scalable, fully pipelined BNN architecture that aims to maximize throughput while maintaining energy and resource efficiency on large FPGAs. It achieves high performance by exploiting multiple levels of parallelism and balancing the pipeline stages. Moreover, we share on-chip memory and balance the computation resources to further improve resource utilization. We also propose a methodology that explores the design space for the optimal configuration. The design is evaluated on a Xilinx UltraScale XCKU115. The results show that the proposed architecture achieves 2.24× to 11.24× higher performance and 2.43× to 11.79× higher resource efficiency than other BNN accelerators.
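The XNOR substitution mentioned in the abstract can be illustrated with a minimal software sketch. It is not the authors' hardware design: it only shows the standard arithmetic identity that, for vectors with entries in {+1, -1} encoded as bits (1 for +1, 0 for -1), the dot product equals twice the number of matching bits minus the vector length. The function names `pack_bits`, `xnor_popcount_dot`, and `reference_dot` are hypothetical and chosen for this example.

```python
def pack_bits(values):
    """Pack a list of +1/-1 values into an int; bit i is 1 when values[i] == +1."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def xnor_popcount_dot(a_bits, w_bits, n):
    """Dot product of two n-element {+1, -1} vectors given in packed-bit form.

    XNOR marks the positions where the two operands agree (product = +1).
    With `matches` agreeing positions, the dot product is
    matches - (n - matches) = 2 * matches - n.
    """
    mask = (1 << n) - 1                    # keep only the n valid bit positions
    agree = ~(a_bits ^ w_bits) & mask      # XNOR, restricted to n bits
    matches = bin(agree).count("1")        # popcount
    return 2 * matches - n


def reference_dot(a, w):
    """Ordinary multiply-accumulate dot product, used only as a cross-check."""
    return sum(x * y for x, y in zip(a, w))


if __name__ == "__main__":
    # Illustrative activations and weights (made-up values, not from the paper).
    activations = [1, -1, 1, 1, -1, -1, 1, -1]
    weights = [1, 1, -1, 1, -1, 1, 1, -1]
    n = len(activations)
    fast = xnor_popcount_dot(pack_bits(activations), pack_bits(weights), n)
    print(fast, reference_dot(activations, weights))
```

In an FPGA implementation the XNOR and the popcount would typically map to lookup tables and adder trees rather than to multipliers, which is the sense in which the abstract calls BNNs hardware-friendly.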
Pages: 17-30
Number of pages: 13