Vision Transformer (ViT)-based Applications in Image Classification

Cited by: 6

Authors:

Huo, Yingzi [1]

Jin, Kai [2]

Cai, Jiahong [1]

Xiong, Huixuan [1]

Pang, Jiacheng [1]

Affiliations:

[1] Hunan Univ Sci & Technol, Sch Comp Sci & Engn, Hunan Key Lab Serv Comp & Novel Software Technol, Xiangtan 411201, Peoples R China

[2] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410002, Peoples R China

Source:

2023 IEEE 9TH INTL CONFERENCE ON BIG DATA SECURITY ON CLOUD, BIGDATASECURITY, IEEE INTL CONFERENCE ON HIGH PERFORMANCE AND SMART COMPUTING, HPSC AND IEEE INTL CONFERENCE ON INTELLIGENT DATA AND SECURITY, IDS | 2023

Keywords:

CNN; image classification; token; vision transformer; Vision Reservoir

DOI:

10.1109/BigDataSecurity-HPSC-IDS58521.2023.00033

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In recent years, the ViT model has been widely used in computer vision, especially for image classification tasks. This paper summarizes the application of ViT in image classification. It first introduces the image classification implementation process and the basic architecture of the ViT model. It then analyzes and summarizes image classification methods, including traditional methods, CNN-based methods, and ViT-based methods, and provides a comparative analysis of CNN and ViT. Finally, the paper outlines the application prospects and future development of ViT in image classification, along with some shortcomings of ViT and their solutions.

Pages: 135-140

Page count: 6

50 records in total

[1] CLASSIFICATION OF INTRACRANIAL HEMORRHAGE BASED ON CT-SCAN IMAGE WITH VISION TRANSFORMER (VIT) METHOD
Faiz, Muhammad Nur
Badriyah, Tessy
Kusuma, Selvia Ferdiana
2024 INTERNATIONAL ELECTRONICS SYMPOSIUM, IES 2024, 2024, : 454 - 459
[2] MIL-ViT: A multiple instance vision transformer for fundus image classification
Bi, Qi
Sun, Xu
Yu, Shuang
Ma, Kai
Bian, Cheng
Ning, Munan
He, Nanjun
Huang, Yawen
Li, Yuexiang
Liu, Hanruo
Zheng, Yefeng
JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2023, 97
[3] ViT-DualAtt: An efficient pornographic image classification method based on Vision Transformer with dual attention
Cai, Zengyu
Xu, Liusen
Zhang, Jianwei
Feng, Yuan
Zhu, Liang
Liu, Fangmei
ELECTRONIC RESEARCH ARCHIVE, 2024, 32 (12): 6698 - 6716
[4] A new ECT image reconstruction algorithm based on Vision transformer (ViT)
Wu, Xin-Jie
Xu, Si-Kai
Liu, Yan-Dong
Liu, Shi-Xing
Yan, Hua
Wang, Yan
Gao, Ming-Yu
FLOW MEASUREMENT AND INSTRUMENTATION, 2024, 97
[5] SI-ViT: Shuffle instance-based Vision Transformer for pancreatic cancer ROSE image classification
Zhang, Tianyi
Feng, Youdan
Zhao, Yu
Lei, Yanli
Ying, Nan
Song, Fan
He, Yufang
Yan, Zhiling
Feng, Yunlu
Yang, Aiming
Zhang, Guanglei
COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2024, 244
[6] GNViT- An enhanced image-based groundnut pest classification using Vision Transformer (ViT) model
Venkatasaichandrakanth, P.
Iyapparaja, M.
PLOS ONE, 2024, 19 (03):
[7] A ViT Vision Transformer Model for Rose Leaf Disease Classification
Saini, Archana
Guleria, Kalpna
Sharma, Shagun
2024 2ND WORLD CONFERENCE ON COMMUNICATION & COMPUTING, WCONF 2024, 2024,
[8] CWC-MP-MC Image-based breast tumor classification using an optimized Vision Transformer (ViT)
Kabir, Shahriar Mahmud
Bhuiyan, Mohammed Imamul Hassan
BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2025, 100
[9] MAT-VIT: A Vision Transformer with MAE-Based Self-Supervised Auxiliary Task for Medical Image Classification
Han, Yufei
Chen, Haoyuan
Yao, Linwei
Li, Kuan
Yin, Jianping
PROCEEDINGS OF THE 2024 27TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN, CSCWD 2024, 2024, : 2040 - 2046
[10] The Application of Vision Transformer in Image Classification
He, Zhixuan
2022 THE 6TH INTERNATIONAL CONFERENCE ON VIRTUAL AND AUGMENTED REALITY SIMULATIONS, ICVARS 2022, 2022, : 56 - 63
