Design Flow of Accelerating Hybrid Extremely Low Bit-width Neural Network in Embedded FPGA

Cited by: 53
Authors
Wang, Junsong [1 ]
Lou, Qiuwen [2 ]
Zhang, Xiaofan [3 ]
Zhu, Chao [1 ]
Lin, Yonghua [1 ]
Chen, Deming [3 ]
Affiliations
[1] IBM Res China, Beijing, Peoples R China
[2] Univ Notre Dame, Notre Dame, IN 46556 USA
[3] Univ Illinois, Champaign, IL USA
DOI: 10.1109/FPL.2018.00035
Chinese Library Classification: TP3 (computing technology, computer technology)
Discipline code: 0812
Abstract
Neural network accelerators with low latency and low energy consumption are desirable for edge computing. To create such accelerators, we propose a design flow for accelerating the extremely low bit-width neural network (ELB-NN) on embedded FPGAs with hybrid quantization schemes. This flow covers both network training and FPGA-based network deployment, which facilitates design space exploration and simplifies the tradeoff between network accuracy and computation efficiency. Using this flow helps hardware designers deliver network accelerators for edge devices under strict resource and power constraints. We demonstrate the proposed flow by supporting hybrid ELB settings within a neural network. Results show that our design can deliver very high performance, peaking at 10.3 TOPS, and classify up to 325.3 images/s/Watt while running large-scale neural networks for less than 5 W using an embedded FPGA. To the best of our knowledge, it is the most energy-efficient solution compared to GPU or other FPGA implementations reported so far in the literature.
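The core idea of a hybrid ELB scheme is that different parts of the network tolerate different precisions, so each layer can be assigned its own bit-width (e.g. keeping the first and last layers wider while squeezing the middle layers to 1-2 bits). The sketch below illustrates this with a generic uniform quantizer in the style common in low bit-width training work; it is not the authors' implementation, and the per-layer bit-width table is a hypothetical example.

```python
import numpy as np

def quantize_uniform(x, bits):
    """Uniformly quantize values in [-1, 1] to the given bit-width.

    bits == 1 falls back to binarization (sign), a common choice for
    extremely low bit-width networks.
    """
    if bits == 1:
        return np.where(x >= 0, 1.0, -1.0)
    levels = 2 ** bits - 1
    x = np.clip(x, -1.0, 1.0)
    # map [-1, 1] -> [0, 1], snap to the nearest of `levels` steps, map back
    return 2.0 * np.round((x + 1.0) / 2.0 * levels) / levels - 1.0

# Hypothetical hybrid scheme: first and last layers keep more bits,
# middle layers go extremely low.
layer_bits = {"conv1": 8, "conv2": 2, "conv3": 2, "fc": 4}

rng = np.random.default_rng(0)
weights = {name: rng.uniform(-1.0, 1.0, size=16) for name in layer_bits}
quantized = {name: quantize_uniform(w, layer_bits[name])
             for name, w in weights.items()}
```

On hardware, such a scheme pays off because a k-bit multiply-accumulate costs far fewer FPGA LUTs than a full-precision one, so lowering the bit-width of the bulk of the layers directly increases the number of parallel compute units that fit in the device.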
Pages: 163-169 (7 pages)