A Block-Floating-Point Arithmetic Based FPGA Accelerator for Convolutional Neural Networks

Cited: 0
Authors
Zhang, Heshan [1 ]
Liu, Zhenyu [2 ]
Zhang, Guanwen [1 ]
Dai, Jiwu [1 ]
Lian, Xiaocong [3 ]
Zhou, Wei [1 ]
Ji, Xiangyang [3 ]
Affiliations
[1] Northwestern Polytechnical University, School of Electronics & Information, Xi'an, People's Republic of China
[2] Tsinghua University, RIIT&TNList, Beijing, People's Republic of China
[3] Tsinghua University, Department of Automation, Beijing, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
CNN; FPGA; block-floating-point;
DOI
10.1109/globalsip45357.2019.8969292
CLC Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Convolutional neural networks (CNNs) have been widely used in computer vision applications and have achieved great success. However, large-scale CNN models usually consume substantial computing and memory resources, which makes them difficult to deploy on embedded devices. In this paper, an efficient block-floating-point (BFP) arithmetic is proposed. Compared with 32-bit floating-point arithmetic, it reduces the memory and off-chip bandwidth requirements during convolution by 50% and 72.37%, respectively. With BFP arithmetic, the costly multiplication and addition operations on floating-point numbers can be replaced by the corresponding fixed-point operations, which are far more efficient in hardware. A CNN model can be deployed on our accelerator with a top-1 accuracy loss of no more than 0.14%, and no retraining or fine-tuning is required. By employing a series of ping-pong memory access schemes, 2-dimensional propagate partial multiply-accumulate (PPMAC) processors, and an optimized memory system, we implemented a CNN accelerator on a Xilinx VC709 evaluation board. The accelerator achieves a performance of 665.54 GOP/s and a power efficiency of 89.7 GOP/s/W at a working frequency of 300 MHz, significantly outperforming previous FPGA-based accelerators.
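The BFP idea summarized above can be illustrated with a minimal sketch: a block of values shares a single exponent derived from the largest magnitude in the block, so each element is stored as a short fixed-point mantissa, and the inner products of convolution reduce to integer multiply-accumulates with one rescale per block. Note this is an illustrative reconstruction, not the authors' implementation; the block size, 16-bit mantissa width, and round-to-nearest mode below are assumptions for demonstration only.

```python
import numpy as np

def to_bfp(block, mantissa_bits=16):
    """Quantize a block of floats to block-floating-point form:
    one shared exponent per block plus signed fixed-point mantissas.
    (Bit width and rounding are illustrative assumptions, not the
    exact design used in the paper.)"""
    max_abs = float(np.max(np.abs(block)))
    if max_abs == 0.0:
        return np.zeros(block.shape, dtype=np.int32), 0
    # Shared exponent chosen so the largest value fits the signed
    # mantissa_bits-wide fixed-point range.
    shared_exp = int(np.ceil(np.log2(max_abs))) - (mantissa_bits - 1)
    scale = 2.0 ** shared_exp
    lo, hi = -(1 << (mantissa_bits - 1)), (1 << (mantissa_bits - 1)) - 1
    mantissas = np.clip(np.round(block / scale), lo, hi).astype(np.int32)
    return mantissas, shared_exp

def from_bfp(mantissas, shared_exp):
    """Reconstruct approximate floating-point values from BFP form."""
    return mantissas.astype(np.float64) * (2.0 ** shared_exp)

# Within a pair of blocks, a convolution's inner product needs only
# integer MACs; the two shared exponents are combined once at the end.
rng = np.random.default_rng(0)
acts, e_a = to_bfp(rng.standard_normal(64))
wgts, e_w = to_bfp(rng.standard_normal(64))
acc = int(np.dot(acts.astype(np.int64), wgts.astype(np.int64)))
result = acc * 2.0 ** (e_a + e_w)   # single rescale per block pair
```

Because exponent alignment happens once per block rather than once per element, the inner multiply-accumulate loop needs no floating-point hardware, which is consistent with the paper's replacement of floating-point operators with fixed-point ones on the FPGA's DSP resources.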
Pages: 5
Related Papers
50 records in total
  • [1] Lian, Xiaocong; Liu, Zhenyu; Song, Zhourui; Dai, Jiwu; Zhou, Wei; Ji, Xiangyang. High-Performance FPGA-Based CNN Accelerator With Block-Floating-Point Arithmetic. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2019, 27(8): 1874-1885.
  • [2] Kobayashi, S.; Fettweis, G. P. A Hierarchical Block-Floating-Point Arithmetic. Journal of VLSI Signal Processing Systems for Signal, Image and Video Technology, 2000, 24(1): 19-30.
  • [3] Mitra, Abhijit. On Roundoff Errors in Block-Floating-Point Arithmetic. IETE Journal of Research, 2006, 52(1): 45-51.
  • [4] Kobayashi, S.; Fettweis, G. P. A New Approach for Block-Floating-Point Arithmetic. ICASSP '99: 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing, 1999: 2009-2012.
  • [5] Fan, Hongxiang; Wang, Gang; Ferianc, Martin; Niu, Xinyu; Luk, Wayne. Static Block Floating-Point Quantization for Convolutional Neural Networks on FPGA. 2019 International Conference on Field-Programmable Technology (ICFPT), 2019: 28-35.
  • [6] Oppenheim, A. V. Realization of Digital Filters Using Block-Floating-Point Arithmetic. IEEE Transactions on Audio and Electroacoustics, 1970, AU-18(2): 130+.
  • [7] Zhou, Gong-Lang; Guo, Kaiyuan; Chen, Xiang; Leung, Kwok Wa. SpCNA: An FPGA-Based Accelerator for Point Cloud Convolutional Neural Networks. 2023 IEEE 31st Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2023: 211.
  • [8] Wong, Yuk; Dong, Zhenjiang; Zhang, Wei. Low Bitwidth CNN Accelerator on FPGA Using Winograd and Block Floating Point Arithmetic. 2021 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), 2021: 218-223.
  • [9] Yin, Xiaodi; Wu, Zhipeng; Li, Dejian; Shen, Chongfei; Liu, Yu. An Efficient Hardware Accelerator for Block Sparse Convolutional Neural Networks on FPGA. IEEE Embedded Systems Letters, 2024, 16(2): 158-161.