H3D-Transformer: A Heterogeneous 3D (H3D) Computing Platform for Transformer Model Acceleration on Edge Devices

Cited by: 4
Authors
Luo, Yandong [1, 2, 3]
Yu, Shimeng [1]
Affiliations
[1] Georgia Inst Technol, Sch Elect & Comp Engn, 791 Atlantic Dr NW, Atlanta, GA 30332 USA
[2] Georgia Inst Technol, Atlanta, GA USA
[3] Apple, Cupertino, CA 95014 USA
Keywords
Compute-in-memory; DNN accelerator; heterogeneous 3D integration; multi-head self-attention; transformer; MEMORY SRAM MACRO
DOI
10.1145/3649219
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
Prior hardware accelerator designs primarily focused on single-chip solutions for 10 MB-class computer vision models. GB-class transformer models for natural language processing (NLP) pose challenges for existing accelerator designs because of their massive number of parameters and the diverse matrix multiplication (MatMul) workloads involved. This work proposes a heterogeneous 3D-based accelerator design for transformer models, which adopts an interposer substrate with multiple 3D memory/logic hybrid cubes optimized for accelerating different MatMul workloads. An approximate computing scheme is proposed to take advantage of the heterogeneous computing paradigms of mixed-signal compute-in-memory (CIM) and digital tensor processing units (TPU). In system-level evaluation, an energy efficiency of 10 TOPS/W is achieved for the BERT and GPT2 models, which is about 2.6× to 3.1× higher than the baseline with a 7 nm TPU and stacked FeFET memory.
Pages: 20