A Comprehensive Survey of Transformers for Computer Vision

Cited by: 32
Authors
Jamil, Sonain [1]
Piran, Md. Jalil [2]
Kwon, Oh-Jin [1]
Affiliations
[1] Sejong Univ, Dept Elect Engn, Seoul 05006, South Korea
[2] Sejong Univ, Dept Comp Engn, Seoul 05006, South Korea
Keywords
vision transformers; computer vision; deep learning; image coding; drone imagery; drone surveillance; anomaly detection; classification; images; CNN
DOI
10.3390/drones7050287
Chinese Library Classification number
TP7 [Remote sensing technology]
Discipline classification codes
081102; 0816; 081602; 083002; 1404
Abstract
Vision transformers (ViTs), a special type of transformer, can be used for various computer vision (CV) applications. Convolutional neural networks (CNNs) have several potential problems that can be resolved with ViTs. For image coding tasks such as compression, super-resolution, segmentation, and denoising, different variants of ViTs are used. In this survey, we identified the many CV applications to which ViTs are applicable. The CV applications reviewed include image classification, object detection, image segmentation, image compression, image super-resolution, image denoising, anomaly detection, and drone imagery. We reviewed the state of the art, compiled a list of available models, and discussed the pros and cons of each model.
Pages: 27