A Comprehensive Survey of Transformers for Computer Vision

Cited by: 32
Authors
Jamil, Sonain [1]
Piran, Md. Jalil [2]
Kwon, Oh-Jin [1]
Affiliations
[1] Sejong Univ, Dept Elect Engn, Seoul 05006, South Korea
[2] Sejong Univ, Dept Comp Engn, Seoul 05006, South Korea
Keywords
vision transformers; computer vision; deep learning; image coding; drone imagery; drone surveillance; anomaly detection; classification; images; CNN
DOI
10.3390/drones7050287
Chinese Library Classification (CLC) number
TP7 [Remote sensing technology]
Discipline classification codes
081102; 0816; 081602; 083002; 1404
Abstract
As a special type of transformer, vision transformers (ViTs) can be used for various computer vision (CV) applications. Convolutional neural networks (CNNs) have several potential problems that can be resolved with ViTs. For image coding tasks such as compression, super-resolution, segmentation, and denoising, different variants of ViTs are used. In our survey, we identified the many CV applications to which ViTs are applicable. The CV applications reviewed include image classification, object detection, image segmentation, image compression, image super-resolution, image denoising, anomaly detection, and drone imagery. We reviewed the state of the art, compiled a list of available models, and discussed the pros and cons of each model.
Pages: 27
Related papers
50 in total
  • [1] Masked Autoencoders in Computer Vision: A Comprehensive Survey
    Zhou, Zexian
    Liu, Xiaojing
    IEEE ACCESS, 2023, 11 : 113560 - 113579
  • [2] Transformers in Vision: A Survey
    Khan, Salman
    Naseer, Muzammal
    Hayat, Munawar
    Zamir, Syed Waqas
    Khan, Fahad Shahbaz
    Shah, Mubarak
    ACM COMPUTING SURVEYS, 2022, 54 (10S)
  • [3] Deep reinforcement learning in computer vision: a comprehensive survey
    Le, Ngan
    Rathour, Vidhiwar Singh
    Yamazaki, Kashu
    Luu, Khoa
    Savvides, Marios
    ARTIFICIAL INTELLIGENCE REVIEW, 2022, 55 (04) : 2733 - 2819
  • [4] Deep reinforcement learning in computer vision: a comprehensive survey
    Ngan Le
    Vidhiwar Singh Rathour
    Kashu Yamazaki
    Khoa Luu
    Marios Savvides
    Artificial Intelligence Review, 2022, 55 : 2733 - 2819
  • [5] Vision Transformers for Computer Go
    Sagri, Amani
    Cazenave, Tristan
    Arjonilla, Jerome
    Saffidine, Abdallah
    APPLICATIONS OF EVOLUTIONARY COMPUTATION, EVOAPPLICATIONS 2024, PT II, 2024, 14635 : 376 - 388
  • [6] Computer vision based food grain classification: A comprehensive survey
    Velesaca, Henry O.
    Suarez, Patricia L.
    Mira, Raul
    Sappa, Angel D.
    COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2021, 187
  • [7] A Comprehensive Survey of Indoor Localization Methods Based on Computer Vision
    Morar, Anca
    Moldoveanu, Alin
    Mocanu, Irina
    Moldoveanu, Florica
    Radoi, Ion Emilian
    Asavei, Victor
    Gradinaru, Alexandru
    Butean, Alex
    SENSORS, 2020, 20 (09)
  • [8] Computer vision-based plants phenotyping: A comprehensive survey
    Meraj, Talha
    Sharif, Muhammad Imran
    Raza, Mudassar
    Alabrah, Amerah
    Kadry, Seifedine
    Gandomi, Amir H.
    ISCIENCE, 2024, 27 (01)
  • [9] Vision Transformers in Image Restoration: A Survey
    Ali, Anas M.
    Benjdira, Bilel
    Koubaa, Anis
    El-Shafai, Walid
    Khan, Zahid
    Boulila, Wadii
    SENSORS, 2023, 23 (05)
  • [10] Vision transformers for dense prediction: A survey
    Zuo, Shuangquan
    Xiao, Yun
    Chang, Xiaojun
    Wang, Xuanhong
    KNOWLEDGE-BASED SYSTEMS, 2022, 253