HeTraX: Energy Efficient 3D Heterogeneous Manycore Architecture for Transformer Acceleration

Authors
Dhingra, Pratyush [1]
Doppa, Janardhan Rao [1]
Pande, Partha Pratim [1]
Affiliations
[1] Washington State University, Pullman, WA 99164, USA
Funding
National Science Foundation (USA)
Keywords
Transformer; Heterogeneity; Accelerator; Thermal-aware; PIM
DOI
10.1145/3665314.3670814
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Transformers have revolutionized deep learning and generative modeling, enabling unprecedented advances in natural language processing and beyond. However, designing hardware accelerators for transformer models is challenging because the transformer architecture involves a wide variety of computing kernels. Existing accelerators either cannot accelerate end-to-end transformer models or suffer from notable thermal limitations. In this paper, we propose a three-dimensional heterogeneous architecture, referred to as HeTraX, that is specifically optimized to accelerate end-to-end transformer models. HeTraX employs hardware resources aligned with the computational kernels of transformers and optimizes both performance and energy. Experimental results show that HeTraX outperforms the existing state of the art by up to 5.6x in speedup and improves the energy-delay product (EDP) by 14.5x while ensuring thermal feasibility.
Pages: 6