Vision transformer models for mobile/edge devices: a survey

Cited by: 3

Authors:

Lee, Seung Il [1]

Koo, Kwanghyun [1]

Lee, Jong Ho [1]

Lee, Gilha [1]

Jeong, Sangbeom [1]

Seongjun, O. [1]

Kim, Hyun [1]

Affiliations:

[1] Seoul Natl Univ Sci & Technol, Res Ctr Elect & Informat Technol, Dept Elect & Informat Engn, 232 Gongneung Ro, Seoul 01811, South Korea

Source:

MULTIMEDIA SYSTEMS | 2024 / Vol. 30 / Issue 02

Funding:

National Research Foundation of Singapore;

Keywords:

Vision transformer; Mobile/edge devices; Survey; NEURAL-NETWORK;

DOI:

10.1007/s00530-024-01312-0

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Subject classification code:

0812;

Abstract:

With the rapidly growing demand for high-performance deep learning vision models on mobile and edge devices, this paper emphasizes the importance of compact deep learning-based vision models that can provide high accuracy while maintaining a small model size. In particular, based on the success of transformer models in natural language processing and computer vision tasks, this paper offers a comprehensive examination of the latest research in redesigning the Vision Transformer (ViT) model into a compact architecture suitable for mobile/edge devices. The paper classifies compact ViT models into three major categories: (1) architecture and hierarchy restructuring, (2) encoder block enhancements, and (3) integrated approaches, and provides a detailed overview of each category. This paper also analyzes the contribution of each method to model performance and computational efficiency, providing a deeper understanding of how to efficiently implement ViT models on edge devices. As a result, this paper can offer new insights into the design and implementation of compact ViT models for researchers in this field and provide guidelines for optimizing the performance and improving the efficiency of deep learning vision models on edge devices.


Pages: 18
