H3D-Transformer: A Heterogeneous 3D (H3D) Computing Platform for Transformer Model Acceleration on Edge Devices

Cited by: 4
Authors
Luo, Yandong [1, 2, 3]
Yu, Shimeng [1 ]
Affiliations
[1] Georgia Inst Technol, Sch Elect & Comp Engn, 791 Atlantic Dr NW, Atlanta, GA 30332 USA
[2] Georgia Inst Technol, Atlanta, GA USA
[3] Apple, Cupertino, CA 95014 USA
Keywords
Compute-in-memory; DNN accelerator; heterogeneous 3D integration; multi-head self-attention; transformer; MEMORY SRAM MACRO
DOI
10.1145/3649219
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Prior hardware accelerator designs primarily focused on single-chip solutions for 10 MB-class computer vision models. The GB-class transformer models for natural language processing (NLP) impose challenges on existing accelerator design due to the massive number of parameters and the diverse matrix multiplication (MatMul) workloads involved. This work proposes a heterogeneous 3D-based accelerator design for transformer models, which adopts an interposer substrate with multiple 3D memory/logic hybrid cubes optimized for accelerating different MatMul workloads. An approximate computing scheme is proposed to take advantage of the heterogeneous computing paradigms of mixed-signal compute-in-memory (CIM) and digital tensor processing units (TPU). In the system-level evaluation, an energy efficiency of 10 TOPS/W is achieved for the BERT and GPT2 models, which is about 2.6× to 3.1× higher than the baseline with 7 nm TPU and stacked FeFET memory.
Pages: 20