HeTraX: Energy Efficient 3D Heterogeneous Manycore Architecture for Transformer Acceleration

Authors
Dhingra, Pratyush [1]
Doppa, Janardhan Rao [1]
Pande, Partha Pratim [1]
Affiliations
[1] Washington State Univ, Pullman, WA 99164 USA
Funding
National Science Foundation (USA)
Keywords
Transformer; Heterogeneity; Accelerator; Thermal-aware; PIM;
DOI
10.1145/3665314.3670814
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Transformers have revolutionized deep learning and generative modeling, enabling unprecedented advances in natural language processing and beyond. However, designing hardware accelerators for transformer models is challenging because the transformer architecture involves a wide variety of computing kernels. Existing accelerators either cannot accelerate end-to-end transformer models or suffer from notable thermal limitations. In this paper, we propose a three-dimensional heterogeneous architecture, referred to as HeTraX, that is specifically optimized to accelerate end-to-end transformer models. HeTraX employs hardware resources aligned with the computational kernels of transformers and optimizes both performance and energy. Experimental results show that HeTraX outperforms the existing state of the art by up to 5.6x in speedup and improves the energy-delay product (EDP) by 14.5x while ensuring thermal feasibility.
Pages: 6