Transformer-OPU: An FPGA-based Overlay Processor for Transformer Networks

Cited by: 6
Authors
Bai, Yueyin [1]
Zhou, Hao [1]
Zhao, Keqing [1]
Chen, Jianli [1]
Yu, Jun [1]
Wang, Kun [1]
Affiliation
[1] Fudan Univ, State Key Lab ASIC & Syst, Shanghai, Peoples R China
DOI
10.1109/FCCM57271.2023.00049
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Existing field-programmable gate array (FPGA) implementations of transformer networks either focus only on attention computation or are tied to a fixed model structure and lack flexibility. In this article, we propose an FPGA-based overlay processor, named Transformer-OPU, for general acceleration of transformer networks. Experimental results show that Transformer-OPU achieves 5.19-15.06x and 1.14-2.89x speedups over CPU and GPU, respectively. We also observe 1.10-2.47x better latency than previous customized FPGA accelerators, and Transformer-OPU is 1.45x faster than NPE.
Pages: 222 / 222
Page count: 1