An FPGA-Based Transformer Accelerator Using Output Block Stationary Dataflow for Object Recognition Applications

Cited by: 6
Authors
Zhao, Zhongyu [1, 2]
Cao, Rujian [1, 2]
Un, Ka-Fai [1, 2]
Yu, Wei-Han [1, 2]
Mak, Pui-In [1, 2]
Martins, Rui P. [1, 2, 3]
Affiliations
[1] Univ Macau, State Key Lab Analog & Mixed Signal VLSI, Inst Microelect, Macau, Peoples R China
[2] Univ Macau, Fac Sci & Technol ECE, Macau, Peoples R China
[3] Univ Lisbon, Inst Super Tecn, P-1649004 Lisbon, Portugal
Keywords
Transformers; Energy efficiency; Broadcasting; Convolutional neural networks; Integrated circuit modeling; Field programmable gate arrays; Random access memory; Dataflow; digital accelerator; energy-efficient; field-programmable gate array (FPGA); image recognition; transformer; CNN ACCELERATOR; EFFICIENT; HARDWARE
DOI
10.1109/TCSII.2022.3196055
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract
Transformer-based models have great potential to deliver higher accuracy than convolutional neural networks (CNNs) in object recognition applications. Yet a transformer-based model shares weights far less than a CNN does, so it needs a different dataflow to reduce memory access. This brief proposes a transformer accelerator with an output block stationary (OBS) dataflow that minimizes repeated memory access through block-level and vector-level broadcasting while preserving a high digital signal processor (DSP) utilization rate, leading to higher energy efficiency. It also lowers the memory access bandwidth required for the input and output. Verified on an FPGA, the proposed accelerator evaluates a transformer-in-transformer (TNT) model with a throughput of 728.3 GOPs, corresponding to an energy efficiency of 58.31 GOPs/W.
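This record names the OBS dataflow but gives no implementation details. As a rough software analogy only, and not the authors' hardware design, the general idea of keeping an output block stationary can be sketched on a plain matrix multiply. The function name `matmul_output_block_stationary` and the `block` parameter are hypothetical, introduced here for illustration: each output block is accumulated in a local buffer until it is complete and written back once, while every fetched input value is reused across the whole block.

```python
def matmul_output_block_stationary(a, b, block=2):
    """Blocked matrix multiply that completes one output block at a time.

    Illustrative sketch only: `a` is rows x inner, `b` is inner x cols,
    both given as lists of lists. The block size is a hypothetical parameter.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0] * cols for _ in range(rows)]

    for i0 in range(0, rows, block):
        for j0 in range(0, cols, block):
            i_end = min(i0 + block, rows)
            j_end = min(j0 + block, cols)

            # Local accumulator for this output block; it stays in place
            # (stationary) while the reduction dimension is swept.
            acc = [[0] * (j_end - j0) for _ in range(i_end - i0)]

            for k in range(inner):
                # Fetch one slice of each operand once per step ...
                a_col = [a[i][k] for i in range(i0, i_end)]
                b_row = b[k][j0:j_end]
                # ... and reuse every fetched value across the whole block.
                for bi, a_val in enumerate(a_col):
                    for bj, b_val in enumerate(b_row):
                        acc[bi][bj] += a_val * b_val

            # Write the finished block back exactly once.
            for bi in range(i_end - i0):
                out[i0 + bi][j0:j_end] = acc[bi]

    return out
```

In this loop order no partial sum is written out and read back, which is the memory-access saving an output-stationary scheme aims for; how the accelerator itself realizes block-level and vector-level broadcasting is not described in this record.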
Pages: 281-285
Page count: 5