Energy Efficient FPGA-Based Binary Transformer Accelerator for Edge Devices

Cited by: 0
Authors
Du, Congpeng [1 ]
Ko, Seok-Bum [2 ]
Zhang, Hao [1 ]
Affiliations
[1] Ocean Univ China, Fac Informat Sci & Engn, Qingdao, Peoples R China
[2] Univ Saskatchewan, Dept Elect & Comp Engn, Saskatoon, SK, Canada
Keywords
DOI
10.1109/ISCAS58744.2024.10558631
CLC classification
TP39 [Computer Applications]
Discipline codes
081203; 0835
Abstract
Transformer-based large language models have gained much attention recently. Due to their superior performance, they are expected to replace conventional deep learning methods in many application fields, including edge computing. However, transformer models require even more computation and parameters than convolutional neural networks, which makes them challenging to deploy on resource-constrained edge devices. To tackle this problem, this paper proposes an efficient FPGA-based binary transformer accelerator. Within the proposed architecture, an energy-efficient matrix multiplication decomposition method is proposed to reduce the amount of computation. Moreover, an efficient binarized Softmax computation method is also proposed to reduce the memory footprint during Softmax computation. The proposed architecture is implemented on a Xilinx Zynq UltraScale+ device, and implementation results show that the proposed matrix multiplication decomposition method can reduce runtime computation by up to 78%. The proposed transformer accelerator achieves improved throughput and energy efficiency compared to previous transformer accelerator designs.
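The abstract does not detail the proposed decomposition, but binary transformer accelerators of this kind typically replace full-precision matrix multiplication with XNOR-popcount arithmetic over {-1, +1} operands. The sketch below is a generic illustration of that standard technique, not the paper's specific method; the function name and bit encoding (1 → +1, 0 → -1) are assumptions for the example.

```python
import numpy as np

def binary_matmul_xnor(A_bits, B_bits):
    """Illustrative binary matmul via XNOR-popcount (not the paper's method).

    A_bits (M, n) and B_bits (n, N) hold {0, 1} entries encoding {-1, +1}
    values (1 -> +1, 0 -> -1). For length-n binary vectors, the {-1, +1}
    dot product equals 2 * popcount(XNOR(a, b)) - n, since each agreeing
    bit pair contributes +1 and each disagreeing pair contributes -1.
    """
    n = A_bits.shape[1]
    # XNOR of bit vectors: 1 where the bits agree. On hardware this is a
    # single gate per bit plus a popcount, instead of a multiply-accumulate.
    agree = A_bits[:, None, :] == B_bits.T[None, :, :]   # (M, N, n)
    popcount = agree.sum(axis=2)                         # popcount per output
    return 2 * popcount - n

# Sanity check against a full-precision {-1, +1} matrix product.
rng = np.random.default_rng(0)
A_bits = rng.integers(0, 2, size=(4, 8))
B_bits = rng.integers(0, 2, size=(8, 3))
A = 2 * A_bits - 1
B = 2 * B_bits - 1
assert np.array_equal(binary_matmul_xnor(A_bits, B_bits), A @ B)
```

On an FPGA, the XNOR and popcount map directly onto LUTs, which is the main source of the energy savings binary accelerators report over multiply-accumulate datapaths.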
Pages: 5
Related papers
50 records total
  • [21] SparkNoC: An energy-efficiency FPGA-based accelerator using optimized lightweight CNN for edge computing
    Xia, Ming
    Huang, Zunkai
    Tian, Li
    Wang, Hui
    Chang, Victor
    Zhu, Yongxin
    Feng, Songlin
    JOURNAL OF SYSTEMS ARCHITECTURE, 2021, 115
  • [22] Design an Efficient FPGA-Based Accelerator for Leveled BFV Homomorphic Encryption
    Kong, Liang
    Qin, Guojie
    Li, Shuguo
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2024, 71 (03) : 1381 - 1385
  • [23] An Efficient FPGA-Based Accelerator for Perceptual Weighting Filter in Speech Coding
    Singh, Dilip
    Chandel, Rajeevan
    IETE TECHNICAL REVIEW, 2024, 41 (04) : 441 - 453
  • [24] An FPGA-Based accelerator for multiphysics modeling
    Huang, XM
    Ma, J
    ERSA '04: THE 2004 INTERNATIONAL CONFERENCE ON ENGINEERING OF RECONFIGURABLE SYSTEMS AND ALGORITHMS, 2004, : 209 - 212
  • [25] An Efficient FPGA-Based Dilated and Transposed Convolutional Neural Network Accelerator
    Wu, Tsung-Hsi
    Shu, Chang
    Liu, Tsung-Te
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2024, 71 (11) : 5178 - 5186
  • [26] Efficient FPGA-based Accelerator for Post-Processing in Object Detection
    Guo, Zibo
    Liu, Kai
    Liu, Wei
    Li, Shangrong
    2023 INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE TECHNOLOGY, ICFPT, 2023, : 125 - 131
  • [27] An FPGA-Based Computation-Efficient Convolutional Neural Network Accelerator
    Archana, V. S.
    2022 IEEE INTERNATIONAL POWER AND RENEWABLE ENERGY CONFERENCE, IPRECON, 2022,
  • [28] QEGCN: An FPGA-based accelerator for quantized GCNs with edge-level parallelism
    Yuan, Wei
    Tian, Teng
    Wu, Qizhe
    Jin, Xi
    JOURNAL OF SYSTEMS ARCHITECTURE, 2022, 129
  • [29] An FPGA-Based Energy-Efficient Reconfigurable Convolutional Neural Network Accelerator for Object Recognition Applications
    Li, Jixuan
    Un, Ka-Fai
    Yu, Wei-Han
    Mak, Pui-In
    Martins, Rui P.
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2021, 68 (09) : 3143 - 3147
  • [30] Optimization of Energy Efficiency for FPGA-Based Convolutional Neural Networks Accelerator
    Tang, Yongming
    Dai, Rongshi
    Xie, Yi
    2020 4TH INTERNATIONAL CONFERENCE ON CONTROL ENGINEERING AND ARTIFICIAL INTELLIGENCE (CCEAI 2020), 2020, 1487