Energy Efficient FPGA-Based Binary Transformer Accelerator for Edge Devices

Authors
Du, Congpeng [1]
Ko, Seok-Bum [2]
Zhang, Hao [1]
Affiliations
[1] Ocean Univ China, Fac Informat Sci & Engn, Qingdao, Peoples R China
[2] Univ Saskatchewan, Dept Elect & Comp Engn, Saskatoon, SK, Canada
DOI
10.1109/ISCAS58744.2024.10558631
Chinese Library Classification
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Transformer-based large language models have recently gained much attention. Owing to their superior performance, they are expected to replace conventional deep learning methods in many application fields, including edge computing. However, transformer models require even more computation and parameters than convolutional neural networks, which makes them challenging to deploy on resource-constrained edge devices. To tackle this problem, this paper proposes an efficient FPGA-based binary transformer accelerator. Within the proposed architecture, an energy-efficient matrix multiplication decomposition method reduces the amount of computation, and an efficient binarized Softmax computation method reduces the memory footprint during Softmax computation. The architecture is implemented on a Xilinx Zynq UltraScale+ device, and implementation results show that the proposed matrix multiplication decomposition method reduces computation at runtime by up to 78%. The proposed transformer accelerator achieves improved throughput and energy efficiency compared with previous transformer accelerator designs.
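The abstract does not describe the decomposition method itself. As background only, the sketch below shows the XNOR-popcount formulation commonly used for matrix multiplication over binarized (+1/-1) values, which is the kind of operation a binary transformer accelerator has to compute. It is not the authors' method, and the function names (`pack_bits`, `binary_dot`, `binary_matmul`) are hypothetical.

```python
# Background sketch (assumption: NOT the paper's decomposition method).
# Binary matrix multiplication over {+1, -1} using XNOR and popcount.

def pack_bits(values):
    """Pack a sequence of +1/-1 values into an int (bit i is 1 for +1)."""
    word = 0
    for i, v in enumerate(values):
        if v not in (1, -1):
            raise ValueError("values must be +1 or -1")
        if v == 1:
            word |= 1 << i
    return word

def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n."""
    mask = (1 << n) - 1
    # XNOR marks the positions where both operands agree.
    matches = bin(~(a_bits ^ b_bits) & mask).count("1")
    # Each match contributes +1 and each mismatch -1.
    return 2 * matches - n

def binary_matmul(A, B):
    """Multiply an m x k matrix A by a k x n matrix B (entries +1/-1)."""
    k = len(B)
    a_rows = [pack_bits(row) for row in A]
    b_cols = [pack_bits([B[i][j] for i in range(k)])
              for j in range(len(B[0]))]
    return [[binary_dot(ar, bc, k) for bc in b_cols] for ar in a_rows]

A = [[1, -1, 1], [-1, -1, 1]]
B = [[1, 1], [-1, 1], [1, -1]]
print(binary_matmul(A, B))
```

Replacing multiply-accumulate with bitwise XNOR and a bit count is what makes binarized models attractive for FPGA logic; how the paper further decomposes the multiplication to skip work is not stated in the abstract.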
Pages: 5