Vision Transformer (ViT)-based Applications in Image Classification

Cited by: 6

Authors:

Huo, Yingzi [1]

Jin, Kai [2]

Cai, Jiahong [1]

Xiong, Huixuan [1]

Pang, Jiacheng [1]

Affiliations:

[1] Hunan Univ Sci & Technol, Sch Comp Sci & Engn, Hunan Key Lab Serv Comp & Novel Software Technol, Xiangtan 411201, Peoples R China

[2] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410002, Peoples R China

Source:

2023 IEEE 9TH INTL CONFERENCE ON BIG DATA SECURITY ON CLOUD, BIGDATASECURITY, IEEE INTL CONFERENCE ON HIGH PERFORMANCE AND SMART COMPUTING, HPSC AND IEEE INTL CONFERENCE ON INTELLIGENT DATA AND SECURITY, IDS | 2023

Keywords:

CNN; image classification; token; vision transformer; Vision Reservoir;

DOI:

10.1109/BigDataSecurity-HPSC-IDS58521.2023.00033

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In recent years, the Vision Transformer (ViT) model has been widely used in computer vision, especially for image classification. This paper surveys the application of ViT to image classification. It first introduces the image classification implementation process and the basic architecture of the ViT model. It then reviews image classification methods, covering traditional methods, CNN-based methods, and ViT-based methods, and provides a comparative analysis of CNN and ViT. Finally, it outlines the application prospects and future development of ViT in image classification, along with some shortcomings of ViT and their solutions.

Pages: 135-140

Page count: 6
