A Comprehensive Survey of Transformers for Computer Vision

Cited by: 32

Authors:

Jamil, Sonain [1]

Piran, Md. Jalil [2]

Kwon, Oh-Jin [1]

Affiliations:

[1] Sejong Univ, Dept Elect Engn, Seoul 05006, South Korea

[2] Sejong Univ, Dept Comp Engn, Seoul 05006, South Korea

Source:

DRONES | 2023 / Vol. 7 / Issue 05

Keywords:

vision transformers; computer vision; deep learning; image coding; drone imagery; drone surveillance; ANOMALY DETECTION; CLASSIFICATION; IMAGES; CNN;

DOI:

10.3390/drones7050287

Chinese Library Classification number:

TP7 [Remote sensing technology];

Discipline classification codes:

081102 ; 0816 ; 081602 ; 083002 ; 1404 ;

Abstract:

As a special type of transformer, vision transformers (ViTs) can be used for various computer vision (CV) applications. Convolutional neural networks (CNNs) have several potential problems that can be resolved with ViTs. For image coding tasks such as compression, super-resolution, segmentation, and denoising, different variants of ViTs are used. In our survey, we identified the many CV applications to which ViTs are applicable. The CV applications reviewed include image classification, object detection, image segmentation, image compression, image super-resolution, image denoising, anomaly detection, and drone imagery. We reviewed the state of the art, compiled a list of available models, and discussed the pros and cons of each model.

Pages: 27
