An Efficient Piecewise Linear Approximation of Non-linear Operations for Transformer Inference

Cited by: 1
Authors
Lu, Haodong [1]
Mei, Qichang [1]
Wang, Kun [1]
Affiliations
[1] Fudan Univ, State Key Lab ASIC & Syst, Shanghai, Peoples R China
DOI
10.1109/FCCM57271.2023.00034
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Transformer-based models have achieved remarkable performance across a variety of tasks, but their computational complexity is an obstacle to deployment on resource-constrained devices. To address this, the paper proposes NPLA, an efficient framework for approximating non-linear operations during Transformer inference on hardware accelerators. Specifically, NPLA approximates non-linear operations with non-uniform piecewise linear functions and converts the resulting coefficients directly into LUTs for hardware implementation. Experimental results show that NPLA reduces hardware cost by 13.43x in LUTs and 1.98x in DSPs compared with the state-of-the-art method.
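For readers unfamiliar with the general idea, the following is a minimal sketch of a non-uniform piecewise linear approximation backed by a coefficient table. It is not the NPLA method itself: the abstract does not say how breakpoints are chosen or how coefficients are quantized, so the choice of GELU as the example operation, the hand-picked breakpoints, the chord-based fitting, and the names `build_lut`, `pwl_eval`, and `max_abs_error` are all assumptions made for illustration.

```python
import math
from bisect import bisect_right


def gelu(x: float) -> float:
    """Exact GELU, used here only as an example non-linear operation."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def build_lut(f, breakpoints):
    """Return one (slope, intercept) pair per segment between breakpoints.

    Each segment is the chord through the function values at its two ends.
    """
    lut = []
    for x0, x1 in zip(breakpoints, breakpoints[1:]):
        slope = (f(x1) - f(x0)) / (x1 - x0)
        intercept = f(x0) - slope * x0
        lut.append((slope, intercept))
    return lut


def pwl_eval(x, breakpoints, lut):
    """Find the segment containing x, then apply one multiply-add."""
    i = bisect_right(breakpoints, x) - 1
    # Clamp so that inputs outside the range reuse the edge segments.
    i = max(0, min(i, len(lut) - 1))
    slope, intercept = lut[i]
    return slope * x + intercept


# Hypothetical non-uniform breakpoints: denser near 0, where GELU bends most.
BREAKPOINTS = [-6.0, -4.0, -3.0, -2.0, -1.5, -1.0, -0.5,
               0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
LUT = build_lut(gelu, BREAKPOINTS)


def max_abs_error(samples: int = 2001) -> float:
    """Largest absolute error over an evenly spaced grid on the covered range."""
    lo, hi = BREAKPOINTS[0], BREAKPOINTS[-1]
    xs = [lo + (hi - lo) * k / (samples - 1) for k in range(samples)]
    return max(abs(gelu(x) - pwl_eval(x, BREAKPOINTS, LUT)) for x in xs)
```

In this sketch the table holds 14 coefficient pairs, and evaluation needs only a segment lookup plus one multiply and one add, which is the property that makes this family of approximations attractive for hardware. Placing breakpoints more densely where the function curves is what "non-uniform" means here; a real design would also have to fix the coefficient bit widths, which this example leaves out.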
Pages: 206-206
Page count: 1