Optimizing Accelerator Configurability for Mobile Transformer Networks

Cited by: 2
Authors
Colleman, Steven [1 ]
Zhu, Peter [2 ]
Sun, Wei [2 ]
Verhelst, Marian [1 ]
Affiliations
[1] KU Leuven, Department of Electrical Engineering, MICAS-ESAT, Leuven, Belgium
[2] OPPO Electronics, Dongguan, People's Republic of China
Keywords
CNN; transformer; cross-layer; configurability; hardware modelling and optimization
DOI
10.1109/AICAS54282.2022.9869945
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Transformers are increasingly used to process time series data. However, their deployment on mobile devices is challenging due to their large computational requirements. Hardware acceleration through custom neural network processors can reduce the network's latency and energy footprint. Yet, the large variety of layer types in mobile transformers complicates the selection of the best accelerator architecture and hardware mapping. Specifically, layer performance depends strongly on the spatial unrolling dimensions, i.e. the layer loop dimensions along which hardware parallelization is enabled. This paper therefore investigates the best datapath organization, and the datapath flexibility required, to efficiently support mobile transformers. The MobileViT-S network is selected as the reference network for its wide variety of layer types. Results are explored across a wide range of accelerator area (datapath dimensions, memory size) and bandwidth constraints.
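
To illustrate the spatial-unrolling sensitivity the abstract describes, below is a minimal Python sketch. It is not the paper's model: the 32x32 PE-array size, the layer shapes, and the spatial_utilization helper are all hypothetical stand-ins. It estimates the fraction of PEs kept busy when two loop dimensions of a layer are unrolled spatially onto a fixed array, with padding to full tiles wasting PEs.

import math

def spatial_utilization(layer, unroll, array=(32, 32)):
    # Each dimension named in `unroll` is mapped onto one array axis;
    # per dimension, utilization = size / (ceil(size / D) * D), i.e.
    # the useful fraction of PEs once tiles are padded to the array.
    util = 1.0
    for dim, d in zip(unroll, array):
        size = layer[dim]
        util *= size / (math.ceil(size / d) * d)
    return util

# Hypothetical layer shapes standing in for MobileViT-S layer types:
# K/C = output/input channels, OX = output width, L = token count.
layers = {
    "pointwise_conv":   {"K": 64,  "C": 96,  "OX": 56, "L": 1},
    "attention_matmul": {"K": 144, "C": 144, "OX": 1,  "L": 256},
}

for name, layer in layers.items():
    for unroll in (("K", "C"), ("K", "OX"), ("K", "L")):
        print(f"{name:17s} unroll {unroll}: "
              f"{spatial_utilization(layer, unroll):.2f}")

Under these assumed shapes, the pointwise convolution reaches full utilization with K/C unrolling while the attention matmul prefers K/L (0.90 vs. 0.03 for K/OX); no single fixed unrolling serves both layer types well, which is the motivation for studying datapath configurability.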
Pages: 142-145
Number of pages: 4