Vision transformer models for mobile/edge devices: a survey

Cited by: 3
Authors
Lee, Seung Il [1]
Koo, Kwanghyun [1]
Lee, Jong Ho [1]
Lee, Gilha [1]
Jeong, Sangbeom [1]
Seongjun, O. [1]
Kim, Hyun [1]
Affiliation
[1] Seoul Natl Univ Sci & Technol, Res Ctr Elect & Informat Technol, Dept Elect & Informat Engn, 232 Gongneung Ro, Seoul 01811, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Vision transformer; Mobile/edge devices; Survey; NEURAL-NETWORK;
DOI
10.1007/s00530-024-01312-0
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
With the rapidly growing demand for high-performance deep learning vision models on mobile and edge devices, this paper emphasizes the importance of compact deep learning-based vision models that can provide high accuracy while maintaining a small model size. In particular, based on the success of transformer models in natural language processing and computer vision tasks, this paper offers a comprehensive examination of the latest research in redesigning the Vision Transformer (ViT) model into a compact architecture suitable for mobile/edge devices. The paper classifies compact ViT models into three major categories: (1) architecture and hierarchy restructuring, (2) encoder block enhancements, and (3) integrated approaches, and provides a detailed overview of each category. This paper also analyzes the contribution of each method to model performance and computational efficiency, providing a deeper understanding of how to efficiently implement ViT models on edge devices. As a result, this paper can offer new insights into the design and implementation of compact ViT models for researchers in this field and provide guidelines for optimizing the performance and improving the efficiency of deep learning vision models on edge devices.
Pages: 18
Related papers
50 in total
  • [21] TPrune: Efficient Transformer Pruning for Mobile Devices
    Mao, Jiachen
    Yang, Huanrui
    Li, Ang
    Li, Hai
    Chen, Yiran
    ACM TRANSACTIONS ON CYBER-PHYSICAL SYSTEMS, 2021, 5 (03)
  • [22] A Survey of Applications of Vision Transformer and Its Variants
    Wu, Chuang
    He, Tingqin
    PROCEEDINGS OF THE 2024 IEEE 10TH INTERNATIONAL CONFERENCE ON INTELLIGENT DATA AND SECURITY, IDS 2024, 2024, : 21 - 25
  • [23] OFFLOADING FOR MOBILE DEVICES: A SURVEY
    Olteanu, Alexandru-Corneliu
    Tapus, Nicolae
    UNIVERSITY POLITEHNICA OF BUCHAREST SCIENTIFIC BULLETIN SERIES C-ELECTRICAL ENGINEERING AND COMPUTER SCIENCE, 2014, 76 (01): : 3 - 16
  • [24] A Survey on Security for Mobile Devices
    La Polla, Mariantonietta
    Martinelli, Fabio
    Sgandurra, Daniele
    IEEE COMMUNICATIONS SURVEYS AND TUTORIALS, 2013, 15 (01): : 446 - 471
  • [25] Offloading for mobile devices: A survey
    Faculty of Automatic Control and Computers, University POLITEHNICA of Bucharest, Romania
    UPB Sci. Bull. Ser. C Electr. Eng., 1 (3-16):
  • [26] FedViT: Federated continual learning of vision transformer at edge
    Zuo, Xiaojiang
    Luopan, Yaxin
    Han, Rui
    Zhang, Qinglong
    Liu, Chi Harold
    Wang, Guoren
    Chen, Lydia Y.
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2024, 154 : 1 - 15
  • [27] ViTA: A Vision Transformer Inference Accelerator for Edge Applications
    Nag, Shashank
    Datta, Gourav
    Kundu, Souvik
    Chandrachoodan, Nitin
    Beerel, Peter A.
    2023 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS, 2023,
  • [28] A Co-Scheduling Framework for DNN Models on Mobile and Edge Devices With Heterogeneous Hardware
    Xu, Zhiyuan
    Yang, Dejun
    Yin, Chengxiang
    Tang, Jian
    Wang, Yanzhi
    Xue, Guoliang
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2023, 22 (03) : 1275 - 1288
  • [29] Containerized Computer Vision Applications on Edge Devices
    Alqaisi, Osamah I.
    Tosun, Ali Saman
    Korkmaz, Turgay
    2023 IEEE INTERNATIONAL CONFERENCE ON EDGE COMPUTING AND COMMUNICATIONS, EDGE, 2023, : 1 - 11
  • [30] Vision for mobile robot navigation: A survey
    DeSouza, GN
    Kak, AC
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2002, 24 (02) : 237 - 267