An Efficient Piecewise Linear Approximation of Non-linear Operations for Transformer Inference

Cited by: 1
Authors
Lu, Haodong [1]
Mei, Qichang [1]
Wang, Kun [1]
Affiliations
[1] Fudan Univ, State Key Lab ASIC & Syst, Shanghai, Peoples R China
DOI
10.1109/FCCM57271.2023.00034
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Transformer-based models achieve remarkable performance across a wide range of tasks, but their computational complexity is an obstacle to deployment on resource-constrained devices. To address this, the paper proposes NPLA, an efficient framework for approximating non-linear operations during Transformer inference on hardware accelerators. Specifically, NPLA approximates non-linear operations with non-uniform piecewise linear functions and converts the resulting coefficients directly into LUTs for hardware implementation. Experimental results show that NPLA reduces hardware cost by 13.43x in LUTs and 1.98x in DSPs compared with the state-of-the-art method.
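The general idea of a non-uniform piecewise linear approximation backed by a coefficient table can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the NPLA algorithm itself: the abstract does not say which functions are approximated, how breakpoints are chosen, or how coefficients are encoded. The choice of GELU as the target, the hand-picked breakpoints, and the names `build_table`, `pwl_eval`, and `max_abs_error` are all hypothetical.

```python
import math
from bisect import bisect_right


def gelu(x):
    """Exact GELU, used here only as an example non-linear function (assumption)."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def build_table(f, breakpoints):
    """Return one (slope, intercept) pair per segment.

    Each segment is the straight line through f at its two end breakpoints.
    """
    table = []
    for x0, x1 in zip(breakpoints, breakpoints[1:]):
        slope = (f(x1) - f(x0)) / (x1 - x0)
        table.append((slope, f(x0) - slope * x0))
    return table


def pwl_eval(x, breakpoints, table):
    """Look up the segment containing x and evaluate slope * x + intercept.

    Inputs outside the breakpoint range reuse the first or last segment.
    """
    idx = bisect_right(breakpoints, x) - 1
    idx = min(max(idx, 0), len(table) - 1)
    slope, intercept = table[idx]
    return slope * x + intercept


# Hand-picked, non-uniform breakpoints (hypothetical): narrower segments near
# zero, where the curvature of GELU is largest, and wider ones in the tails.
BREAKPOINTS = [-4.0, -2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 4.0]
TABLE = build_table(gelu, BREAKPOINTS)


def max_abs_error(lo=-4.0, hi=4.0, steps=801):
    """Largest absolute error of the approximation on an evenly spaced grid."""
    xs = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return max(abs(pwl_eval(x, BREAKPOINTS, TABLE) - gelu(x)) for x in xs)


if __name__ == "__main__":
    print("segments:", len(TABLE))
    print("max abs error on [-4, 4]:", max_abs_error())
```

In this sketch each evaluation costs one table lookup, one multiplication, and one addition. A hardware version would presumably also quantize the slopes and intercepts to fixed point, which is not shown here.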
Pages: 206 / 206
Page count: 1