H3D-Transformer: A Heterogeneous 3D (H3D) Computing Platform for Transformer Model Acceleration on Edge Devices

Cited by: 4
|
Authors
Luo, Yandong [1 ,2 ,3 ]
Yu, Shimeng [1 ]
Affiliations
[1] Georgia Inst Technol, Sch Elect & Comp Engn, 791 Atlantic Dr NW, Atlanta, GA 30332 USA
[2] Georgia Inst Technol, Atlanta, GA USA
[3] Apple, Cupertino, CA 95014 USA
Keywords
Compute-in-memory; DNN accelerator; heterogeneous 3D integration; multi-head self-attention; transformer; MEMORY SRAM MACRO;
DOI
10.1145/3649219
CLC Number
TP3 [Computing technology, computer technology];
Subject Classification Number
0812 ;
Abstract
Prior hardware accelerator designs primarily focused on single-chip solutions for 10 MB-class computer vision models. GB-class transformer models for natural language processing (NLP) pose challenges for existing accelerator designs due to their massive number of parameters and the diverse matrix multiplication (MatMul) workloads involved. This work proposes a heterogeneous 3D-based accelerator design for transformer models, which adopts an interposer substrate with multiple 3D memory/logic hybrid cubes optimized for accelerating different MatMul workloads. An approximate computing scheme is proposed to take advantage of the heterogeneous computing paradigms of mixed-signal compute-in-memory (CIM) and digital tensor processing units (TPUs). From the system-level evaluation results, 10 TOPS/W energy efficiency is achieved for the BERT and GPT-2 models, which is about 2.6x ~ 3.1x higher than the baseline with a 7 nm TPU and stacked FeFET memory.
Pages: 20
Related Papers
50 in total
  • [21] Superpoint Transformer for 3D Scene Instance Segmentation
    Sun, Jiahao
    Qing, Chunmei
    Tan, Junpeng
    Xu, Xiangmin
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 2, 2023, : 2393 - 2401
  • [22] The H3D Dataset for Full-Surround 3D Multi-Object Detection and Tracking in Crowded Urban Scenes
    Patil, Abhishek
    Malla, Srikanth
    Gang, Haiming
    Chen, Yi-Ting
    2019 INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2019, : 9552 - 9557
  • [23] The H3D Four-Channel Sound Card
    Sui, Yi
    上海微型计算机 (Shanghai Microcomputer), 1999, (32) : 28 - 28
  • [24] Query Refinement Transformer for 3D Instance Segmentation
    Lu, Jiahao
    Deng, Jiacheng
    Wang, Chuxin
    He, Jianfeng
    Zhang, Tianzhu
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 18470 - 18480
  • [25] Transient states analysis of 3D transformer structure
    Komeza, K.
    Welfle, H.
    Wiak, S.
    COMPEL - The International Journal for Computation and Mathematics in Electrical and Electronic Engineering, 1998, 17 (02): : 252 - 256
  • [26] Residual Transformer Network for 3D Objects Classification
    Meng, Shan
    Liang, Daoyuan
    Li, Yumei
    IWCMC 2021: 2021 17TH INTERNATIONAL WIRELESS COMMUNICATIONS & MOBILE COMPUTING CONFERENCE (IWCMC), 2021, : 1175 - 1179
  • [27] Efficient 3D Semantic Segmentation with Superpoint Transformer
    Robert, Damien
    Raguet, Hugo
    Landrieu, Loic
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 17149 - 17158
  • [28] 3D Morphable Models as Spatial Transformer Networks
    Bas, Anil
    Huber, Patrik
    Smith, William A. P.
    Awais, Muhammad
    Kittler, Josef
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2017), 2017, : 895 - 903
  • [29] GPU Acceleration of 3D Eddy Current Losses Calculation in Large Power Transformer
    Wu, Dongyang
    Yan, Xiuke
    Tang, Renyuan
    Xie, Dexin
    Ren, Ziyan
    Bai, Baodong
    2016 IEEE CONFERENCE ON ELECTROMAGNETIC FIELD COMPUTATION (CEFC), 2016,