HeTraX: Energy Efficient 3D Heterogeneous Manycore Architecture for Transformer Acceleration

Cited by: 0
Authors
Dhingra, Pratyush [1]
Doppa, Janardhan Rao [1]
Pande, Partha Pratim [1]
Affiliations
[1] Washington State Univ, Pullman, WA 99164 USA
Funding
National Science Foundation (USA)
Keywords
Transformer; Heterogeneity; Accelerator; Thermal-aware; PIM
DOI
10.1145/3665314.3670814
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Transformers have revolutionized deep learning and generative modeling, enabling unprecedented advances in natural language processing tasks and beyond. However, designing hardware accelerators for transformer models is challenging because of the wide variety of computing kernels in the transformer architecture. Existing accelerators are either inadequate for accelerating end-to-end transformer models or suffer from notable thermal limitations. In this paper, we propose a three-dimensional heterogeneous architecture, referred to as HeTraX, that is specifically optimized to accelerate end-to-end transformer models. HeTraX employs hardware resources aligned with the computational kernels of transformers and optimizes both performance and energy. Experimental results show that HeTraX outperforms the existing state of the art by up to 5.6x in speedup and improves the energy-delay product (EDP) by 14.5x while ensuring thermal feasibility.
Pages: 6