HeTraX: Energy Efficient 3D Heterogeneous Manycore Architecture for Transformer Acceleration

Cited by: 0
Authors
Dhingra, Pratyush [1 ]
Doppa, Janardhan Rao [1 ]
Pande, Partha Pratim [1 ]
Affiliations
[1] Washington State Univ, Pullman, WA 99164 USA
Funding
US National Science Foundation;
Keywords
Transformer; Heterogeneity; Accelerator; Thermal-aware; PIM;
DOI
10.1145/3665314.3670814
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Transformers have revolutionized deep learning and generative modeling, enabling unprecedented advances in natural language processing tasks and beyond. However, designing hardware accelerators for executing transformer models is challenging due to the wide variety of computing kernels involved in the transformer architecture. Existing accelerators are either inadequate for accelerating end-to-end transformer models or suffer from notable thermal limitations. In this paper, we propose the design of a three-dimensional heterogeneous architecture, referred to as HeTraX, specifically optimized to accelerate end-to-end transformer models. HeTraX employs hardware resources aligned with the computational kernels of transformers and optimizes both performance and energy. Experimental results show that HeTraX outperforms the existing state-of-the-art by up to 5.6x in speedup and improves EDP by 14.5x while ensuring thermal feasibility.
Pages: 6
Related Papers (50 total)
  • [21] M3D-ADTCO: Monolithic 3D Architecture, Design and Technology Co-Optimization for High Energy Efficient 3D IC
    Thuries, Sebastien
    Billoint, Olivier
    Choisnet, Sylvain
    Lemaire, Romain
    Vivet, Pascal
    Batude, Perrine
    Lattard, Didier
    PROCEEDINGS OF THE 2020 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE 2020), 2020, : 1740 - 1745
  • [22] An efficient heterogeneous parallel algorithm of the 3D MOC for multizone heterogeneous systems
    Li, Runhua
    Liu, Jie
    Zhang, Guangchun
    Gong, Chunye
    Yang, Bo
    Liang, Yuechao
    COMPUTER PHYSICS COMMUNICATIONS, 2023, 292
  • [23] Energy Efficient 3D Hybrid Processor-Memory Architecture for the Dark Silicon Age
    Niknam, Sobhan
    Asad, Arghavan
    Fathy, Mahmood
    Rahmani, Amir-Mohammad
    2015 10TH INTERNATIONAL SYMPOSIUM ON RECONFIGURABLE COMMUNICATION-CENTRIC SYSTEMS-ON-CHIP (RECOSOC), 2015,
  • [24] Efficient 3D Molecular Design with an E(3) Invariant Transformer VAE
    Dollar, Orion
    Joshi, Nisarg
    Pfaendtner, Jim
    Beck, David A. C.
    JOURNAL OF PHYSICAL CHEMISTRY A, 2023, 127 (37): : 7844 - 7852
  • [25] Molformer: Motif-Based Transformer on 3D Heterogeneous Molecular Graphs
    Wu, Fang
    Radev, Dragomir
    Li, Stan Z.
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 4, 2023, : 5312 - 5320
  • [26] An efficient 3D grid based scheduling for heterogeneous systems
    Chronopoulos, AT
    Grosu, D
    Wissink, AM
    Benche, M
    Liu, JY
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2003, 63 (09) : 827 - 837
  • [27] Reconstruction and Efficient Visualization of Heterogeneous 3D City Models
    Buyukdemircioglu, Mehmet
    Kocaman, Sultan
    REMOTE SENSING, 2020, 12 (13)
  • [28] GPU Acceleration of 3D Eddy Current Losses Calculation in Large Power Transformer
    Wu, Dongyang
    Yan, Xiuke
    Tang, Renyuan
    Xie, Dexin
    Ren, Ziyan
    Bai, Baodong
    2016 IEEE CONFERENCE ON ELECTROMAGNETIC FIELD COMPUTATION (CEFC), 2016,
  • [29] Transformer: Run-time reprogrammable heterogeneous architecture for transparent acceleration of dynamic workloads
    Li, Peilong
    Luo, Yan
    Yang, Jun
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2015, 86 : 45 - 61
  • [30] TETRIS: Scalable and efficient neural network acceleration with 3D memory
    Gao, M.
    Pu, J.
    Yang, X.
    Horowitz, M.
    Kozyrakis, C.
    ACM SIGPLAN NOTICES, 2017, 52 (4): : 751 - 764