Vision Transformer-based overlay processor for Edge Computing

Cited by: 0

Authors:

Liu, Fang [1,2]

Fan, Zimeng [1]

Hu, Wei [3]

Xu, Dian [3]

Peng, Min [1]

He, Jing [4]

He, Yanxiang [1]

Affiliations:

[1] Wuhan Univ, Sch Comp Sci, Wuhan, Peoples R China

[2] Wuhan Inst City, Informat Engn Dept, Wuhan, Peoples R China

[3] Wuhan Univ Sci & Technol, Coll Comp Sci, Wuhan, Peoples R China

[4] Kennesaw State Univ, Dept Comp Sci, Marietta, GA USA

Source:

APPLIED SOFT COMPUTING | 2024 / Volume 156

Funding:

National Natural Science Foundation of China;

Keywords:

Edge computing; Transformer; Neural networks; Overlay processor; OPU;

DOI:

10.1016/j.asoc.2024.111421

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Accelerating Visual Neural Networks in Edge Computing environments is crucial for processing image and video data. Visual Neural Networks, including Convolutional Neural Networks and Vision Transformers, are central to image recognition, video analysis, and object detection tasks. Deploying these networks on edge devices and accelerating them can significantly enhance data processing speed and efficiency. The large number of parameters, complex computational flows, and various structural variants of Transformer models present both opportunities and challenges. We propose Vis-TOP (Vision Transformer Overlay Processor), an overlay processor designed for all types of Vision Transformer models. Vis-TOP, distinct from coarse-grained general-purpose accelerators like GPUs and fine-grained custom designs, encapsulates Vision Transformer characteristics into a three-layer, two-level mapping structure, enabling flexible model switching without hardware architecture modifications. We also designed a corresponding instruction bundle and hardware architecture within this mapping structure. We implemented the overlay processor design on the ZCU102 after quantizing the Swin Transformer model to 8-bit fixed point (fix_8). Experimentally, our throughput surpasses that of the GPU implementation by 1.5 times. Our throughput per DSP is 2.2 to 11.7 times higher than that of existing Transformer-like accelerators. Overall, our approach satisfies real-time AI requirements in resource consumption and inference speed. Vis-TOP offers a cost-effective image processing solution for Edge Computing on reconfigurable devices, enhancing computational resource utilization, saving data transfer time and costs, and reducing latency.

Pages: 16