Vision transformer models for mobile/edge devices: a survey

Cited by: 3
Authors
Lee, Seung Il [1]
Koo, Kwanghyun [1]
Lee, Jong Ho [1]
Lee, Gilha [1]
Jeong, Sangbeom [1]
Seongjun, O. [1]
Kim, Hyun [1]
Affiliations
[1] Seoul Natl Univ Sci & Technol, Res Ctr Elect & Informat Technol, Dept Elect & Informat Engn, 232 Gongneung Ro, Seoul 01811, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Vision transformer; Mobile/edge devices; Survey; NEURAL-NETWORK;
DOI
10.1007/s00530-024-01312-0
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
With the rapidly growing demand for high-performance deep learning vision models on mobile and edge devices, this paper emphasizes the importance of compact deep learning-based vision models that can provide high accuracy while maintaining a small model size. In particular, based on the success of transformer models in natural language processing and computer vision tasks, this paper offers a comprehensive examination of the latest research in redesigning the Vision Transformer (ViT) model into a compact architecture suitable for mobile/edge devices. The paper classifies compact ViT models into three major categories: (1) architecture and hierarchy restructuring, (2) encoder block enhancements, and (3) integrated approaches, and provides a detailed overview of each category. This paper also analyzes the contribution of each method to model performance and computational efficiency, providing a deeper understanding of how to efficiently implement ViT models on edge devices. As a result, this paper can offer new insights into the design and implementation of compact ViT models for researchers in this field and provide guidelines for optimizing the performance and improving the efficiency of deep learning vision models on edge devices.
Pages: 18