An Efficient Piecewise Linear Approximation of Non-linear Operations for Transformer Inference

Cited by: 1
Authors
Lu, Haodong [1]
Mei, Qichang [1]
Wang, Kun [1]
Affiliations
[1] Fudan Univ, State Key Lab ASIC & Syst, Shanghai, Peoples R China
DOI
10.1109/FCCM57271.2023.00034
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Transformer-based models have achieved remarkable performance across various tasks, but their computational complexity is an obstacle to deployment on resource-constrained devices. To address this, the paper proposes NPLA, an efficient framework for approximating non-linear operations during Transformer inference on hardware accelerators. Specifically, NPLA approximates non-linear operations with non-uniform piecewise linear functions and converts the resulting coefficients directly into LUTs for hardware implementation. Experimental results show that NPLA reduces hardware cost by 13.43x in LUTs and 1.98x in DSP compared with the state-of-the-art method.
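The general idea of a non-uniform piecewise linear approximation backed by a coefficient table can be illustrated with a minimal Python sketch. This is not the NPLA algorithm, whose breakpoint selection and fixed-point format are not described in this record. It assumes GELU as the target function, hand-picked (hypothetical) breakpoints that are denser where the curve bends most, and floating-point arithmetic in place of hardware number formats. The names `build_pwl_table`, `pwl_eval`, and `BREAKPOINTS` are illustrative only.

```python
import bisect
import math


def gelu(x: float) -> float:
    """Exact GELU, used here only as the reference non-linear function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def build_pwl_table(f, breakpoints):
    """Return one (slope, intercept) pair per segment between breakpoints.

    Each segment is the chord through the function values at its two ends.
    """
    table = []
    for x0, x1 in zip(breakpoints, breakpoints[1:]):
        slope = (f(x1) - f(x0)) / (x1 - x0)
        intercept = f(x0) - slope * x0
        table.append((slope, intercept))
    return table


def pwl_eval(x, breakpoints, table):
    """Evaluate the approximation: locate the segment, then one multiply-add."""
    idx = bisect.bisect_right(breakpoints, x) - 1
    # Inputs outside the table range extrapolate along the first/last segment.
    idx = min(max(idx, 0), len(table) - 1)
    slope, intercept = table[idx]
    return slope * x + intercept


# Hypothetical, hand-picked breakpoints (an assumption, not from the paper):
# narrow segments near 0 where GELU curves most, wide segments in the tails.
BREAKPOINTS = [-6.0, -4.0, -3.0, -2.0, -1.5, -1.0, -0.5,
               0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
TABLE = build_pwl_table(gelu, BREAKPOINTS)


def max_abs_error(f, breakpoints, table, samples=2000):
    """Largest absolute error over an evenly sampled grid of the table range."""
    lo, hi = breakpoints[0], breakpoints[-1]
    worst = 0.0
    for i in range(samples + 1):
        x = lo + (hi - lo) * i / samples
        worst = max(worst, abs(f(x) - pwl_eval(x, breakpoints, table)))
    return worst


if __name__ == "__main__":
    print("segments:", len(TABLE))
    print("max abs error:", max_abs_error(gelu, BREAKPOINTS, TABLE))
```

In a hardware setting the `(slope, intercept)` pairs would be quantized and stored in a lookup table indexed by the segment number, so each evaluation costs one table read plus one multiply-add. The non-uniform spacing lets the same number of table entries cover a wide input range while keeping the error small where the function is most curved.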
Pages: 206-206
Number of pages: 1