Energy Efficient FPGA-Based Binary Transformer Accelerator for Edge Devices

Cited by: 0
Authors
Du, Congpeng [1 ]
Ko, Seok-Bum [2 ]
Zhang, Hao [1 ]
Affiliations
[1] Ocean Univ China, Fac Informat Sci & Engn, Qingdao, Peoples R China
[2] Univ Saskatchewan, Dept Elect & Comp Engn, Saskatoon, SK, Canada
DOI
10.1109/ISCAS58744.2024.10558631
CLC Classification
TP39 [Computer Applications];
Subject Classification Codes
081203 ; 0835 ;
Abstract
Transformer-based large language models have gained much attention recently. Due to their superior performance, they are expected to replace conventional deep learning methods in many application domains, including edge computing. However, transformer models require even more computation and parameters than convolutional neural networks, which makes them challenging to deploy on resource-constrained edge devices. To tackle this problem, this paper proposes an efficient FPGA-based binary transformer accelerator. Within the proposed architecture, an energy-efficient matrix multiplication decomposition method is proposed to reduce the amount of computation. Moreover, an efficient binarized Softmax computation method is proposed to reduce the memory footprint during Softmax computation. The proposed architecture is implemented on a Xilinx Zynq UltraScale+ device, and implementation results show that the proposed matrix multiplication decomposition method can eliminate up to 78% of computation at runtime. The proposed transformer accelerator achieves improved throughput and energy efficiency compared to previous transformer accelerator designs.
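The abstract does not detail the paper's decomposition, but the standard reason binary transformers map well to FPGAs is that a dot product over {-1, +1} values reduces to XNOR and popcount, replacing multipliers with cheap logic. The following is a minimal illustrative sketch of that general XNOR-popcount kernel (not the authors' specific method); the helper names `pack_bits` and `xnor_popcount_dot` are hypothetical.

```python
# Illustrative sketch of the XNOR-popcount binary dot product used in
# binary neural-network hardware; NOT the paper's exact decomposition.
# Values in {-1, +1} are packed one bit each (bit 1 -> +1, bit 0 -> -1);
# the dot product then equals 2 * popcount(xnor(a, b)) - n.

def pack_bits(values):
    """Pack a list of +/-1 values into an integer, one bit per value."""
    mask = 0
    for i, v in enumerate(values):
        if v == 1:
            mask |= 1 << i
    return mask

def xnor_popcount_dot(a_bits, b_bits, n):
    """Dot product of two n-element +/-1 vectors packed as bit masks."""
    xnor = ~(a_bits ^ b_bits) & ((1 << n) - 1)  # bitwise XNOR, masked to n bits
    return 2 * bin(xnor).count("1") - n         # popcount -> signed dot product

# Sanity check against the ordinary arithmetic dot product.
a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
assert xnor_popcount_dot(pack_bits(a), pack_bits(b), len(a)) == sum(
    x * y for x, y in zip(a, b)
)
```

On an FPGA, the XNOR and popcount stages synthesize to LUTs and adder trees rather than DSP multipliers, which is the source of the energy savings binary accelerators target.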
Pages: 5
Related Papers
50 items in total
  • [1] Energy Efficient FPGA-Based Accelerator for Dynamic Sparse Transformer
    Li, Zuohao
    Lai, Yiwan
    Zhang, Hao
    2024 13TH INTERNATIONAL CONFERENCE ON COMMUNICATIONS, CIRCUITS AND SYSTEMS, ICCCAS 2024, 2024, : 7 - 12
  • [2] Efficient FPGA-Based Transformer Accelerator Using In-Block Balanced Pruning
    Wang, Saiqun
    Zhang, Hao
    2024 13TH INTERNATIONAL CONFERENCE ON COMMUNICATIONS, CIRCUITS AND SYSTEMS, ICCCAS 2024, 2024, : 18 - 23
  • [3] An Efficient FPGA-Based Accelerator Design for Convolution
    Song, Peng-Fei
    Pan, Jeng-Shyang
    Yang, Chun-Sheng
    Lee, Chiou-Yng
    2017 IEEE 8TH INTERNATIONAL CONFERENCE ON AWARENESS SCIENCE AND TECHNOLOGY (ICAST), 2017, : 494 - 500
  • [4] An Efficient FPGA-based Accelerator for Deep Forest
    Zhu, Mingyu
    Luo, Jiapeng
    Mao, Wendong
    Wang, Zhongfeng
    2022 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS 22), 2022, : 3334 - 3338
  • [5] Energy efficient spike transformer accelerator at the edge
    Du, Congpeng
    Wen, Qi
    Wei, Zhiqiang
    Zhang, Hao
    Intelligent Marine Technology and Systems, 2 (1):
  • [6] A FPGA-based Neural Accelerator for Small IoT Devices
    Hong, Seongmin
    Park, Yongjun
    PROCEEDINGS INTERNATIONAL SOC DESIGN CONFERENCE 2017 (ISOCC 2017), 2017, : 294 - 295
  • [7] An FPGA-Based Hardware Accelerator for Energy-Efficient Bitmap Index Creation
    Nguyen, Xuan-Thuan
    Hoang, Trong-Thuc
    Nguyen, Hong-Thu
    Inoue, Katsumi
    Pham, Cong-Kha
    IEEE ACCESS, 2018, 6 : 16046 - 16059
  • [8] Optimizing a FPGA-based Neural Accelerator for Small IoT Devices
    Hong, Seongmin
    Lee, Inho
    Park, Yongjun
    2018 INTERNATIONAL CONFERENCE ON ELECTRONICS, INFORMATION, AND COMMUNICATION (ICEIC), 2018, : 176 - 177
  • [9] An FPGA-based Lightweight Deblocking CNN for Edge Devices
    Kim, Jaemyung
    Kang, Jin-Ku
    Kim, Yongwoo
    2023 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS, 2023,
  • [10] FPGA-Based Unified Accelerator for Convolutional Neural Network and Vision Transformer
    Li T.
    Zhang F.
    Wang S.
    Cao W.
    Chen L.
    Dianzi Yu Xinxi Xuebao/Journal of Electronics and Information Technology, 2024, 46 (06): : 2663 - 2672