An FPGA-Based Transformer Accelerator With Parallel Unstructured Sparsity Handling for Question-Answering Applications

Cited by: 0
Authors
Cao, Rujian [1, 2]
Zhao, Zhongyu [1, 2]
Un, Ka-Fai [1, 2]
Yu, Wei-Han [1, 2]
Martins, Rui P. [1, 2, 3]
Mak, Pui-In [1, 2]
Affiliations
[1] Univ Macau, Inst Microelect, State Key Lab Analog & Mixed Signal VLSI, Macau, Peoples R China
[2] Univ Macau, Fac Sci & Technol, ECE, Macau, Peoples R China
[3] Univ Lisbon, Inst Super Tecn, P-1049001 Lisbon, Portugal
Keywords
Sparse matrices; Computational modeling; Transformers; Hardware; Energy efficiency; Circuits; Throughput; Dataflow; digital accelerator; energy-efficient; field-programmable gate array (FPGA); sparsity; transformer; EFFICIENT;
DOI
10.1109/TCSII.2024.3462560
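Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809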
Abstract
Dataflow management yields only limited performance improvement for transformer models because they reuse weights less than convolutional neural networks do. The cosFormer reduces computational complexity while achieving performance comparable to the vanilla transformer on natural language processing tasks. However, the unstructured sparsity in the cosFormer makes it challenging to implement efficiently. This brief proposes a parallel unstructured sparsity handling (PUSH) scheme to compute sparse-dense matrix multiplication (SDMM) efficiently. PUSH transforms unstructured sparsity into structured sparsity and reduces the total memory access by balancing the memory accesses of the sparse and dense matrices in the SDMM. We also employ unstructured weight pruning in cooperation with PUSH to further increase the structured sparsity of the model. Verified on an FPGA platform, the proposed accelerator achieves a throughput of 2.82 TOPS and an energy efficiency of 144.8 GOPS/W on the HotpotQA dataset with long sequences.
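For readers unfamiliar with SDMM, the following is a minimal software sketch of a sparse-dense matrix multiplication over a compressed sparse row (CSR) layout. It is an illustration only and is not the PUSH scheme, whose details the abstract does not give. The CSR format and the function names `dense_to_csr` and `sdmm_csr` are assumptions chosen for this example.

```python
def dense_to_csr(matrix):
    """Convert a dense matrix (list of rows) into CSR arrays.

    Hypothetical helper for illustration; CSR is an assumed storage
    format, not one named in the abstract.
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in matrix:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)   # nonzero value
                col_idx.append(j)  # its column index
        row_ptr.append(len(values))  # end of this row in `values`
    return values, col_idx, row_ptr


def sdmm_csr(values, col_idx, row_ptr, dense):
    """Multiply a CSR sparse matrix by a dense matrix.

    Only nonzero entries of the sparse operand are visited, so zero
    entries cost no multiplications and no dense-row reads.
    """
    n_cols = len(dense[0])
    out = []
    for r in range(len(row_ptr) - 1):
        acc = [0] * n_cols
        for k in range(row_ptr[r], row_ptr[r + 1]):
            v, c = values[k], col_idx[k]
            # Each nonzero selects one row of the dense operand.
            for j in range(n_cols):
                acc[j] += v * dense[c][j]
        out.append(acc)
    return out


sparse = [[0, 2, 0],
          [1, 0, 0],
          [0, 0, 0]]
dense = [[1, 2],
         [3, 4],
         [5, 6]]

values, col_idx, row_ptr = dense_to_csr(sparse)
result = sdmm_csr(values, col_idx, row_ptr, dense)
```

In this sketch the dense rows that get read depend on where the nonzeros happen to fall, which is the irregular access pattern that unstructured sparsity causes and that makes an efficient hardware implementation difficult.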
Pages: 4688-4692
Page count: 5
Related papers
31 in total
  • [1] FPGA-based Garbling Accelerator with Parallel Pipeline Processing
    Oishi, Rin
    Kadomoto, Junichiro
    Irie, Hidetsugu
    Sakai, Shuichi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2023, E106D (12) : 1988 - 1996
  • [2] Energy Efficient FPGA-Based Accelerator for Dynamic Sparse Transformer
    Li, Zuohao
    Lai, Yiwan
    Zhang, Hao
    2024 13TH INTERNATIONAL CONFERENCE ON COMMUNICATIONS, CIRCUITS AND SYSTEMS, ICCCAS 2024, 2024 : 7 - 12
  • [3] An FPGA-Based Transformer Accelerator Using Output Block Stationary Dataflow for Object Recognition Applications
    Zhao, Zhongyu
    Cao, Rujian
    Un, Ka-Fai
    Yu, Wei-Han
    Mak, Pui-In
    Martins, Rui P.
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2023, 70 (01) : 281 - 285
  • [4] FPGA-based Deep Learning Accelerator for RF Applications
    den Boer, H.
    Muller, R. W. D.
    Wong, S.
    Voogt, V.
    2021 IEEE MILITARY COMMUNICATIONS CONFERENCE (MILCOM 2021), 2021
  • [5] Energy Efficient FPGA-Based Binary Transformer Accelerator for Edge Devices
    Du, Congpeng
    Ko, Seok-Bum
    Zhang, Hao
    2024 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS 2024, 2024
  • [6] FPGA-Based Unified Accelerator for Convolutional Neural Network and Vision Transformer
    Li T.
    Zhang F.
    Wang S.
    Cao W.
    Chen L.
    Dianzi Yu Xinxi Xuebao/Journal of Electronics and Information Technology, 2024, 46 (06) : 2663 - 2672
  • [7] An FPGA-Based Reconfigurable Accelerator for Convolution-Transformer Hybrid EfficientViT
    Shao, Haikuo
    Shi, Huihong
    Mao, Wendong
    Wang, Zhongfeng
    2024 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS 2024, 2024
  • [8] Efficient FPGA-Based Transformer Accelerator Using In-Block Balanced Pruning
    Wang, Saiqun
    Zhang, Hao
    2024 13TH INTERNATIONAL CONFERENCE ON COMMUNICATIONS, CIRCUITS AND SYSTEMS, ICCCAS 2024, 2024 : 18 - 23
  • [9] Calculation Optimization for Convolutional Neural Networks and FPGA-based Accelerator Design Using the Parameters Sparsity
    Liu Qinrang
    Liu Chongyang
    JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2018, 40 (06) : 1368 - 1374
  • [10] FPGA-based hardware accelerator of the heat equation with applications on infrared thermography
    Pardo, F.
    Lopez, P.
    Cabello, D.
    2008 INTERNATIONAL CONFERENCE ON APPLICATION-SPECIFIC SYSTEMS, ARCHITECTURES AND PROCESSORS, 2008, : 179 - +