Vision Transformer (ViT)-based Applications in Image Classification

Cited by: 6
Authors
Huo, Yingzi [1]
Jin, Kai [2]
Cai, Jiahong [1]
Xiong, Huixuan [1]
Pang, Jiacheng [1]
Affiliations
[1] Hunan Univ Sci & Technol, Sch Comp Sci & Engn, Hunan Key Lab Serv Comp & Novel Software Technol, Xiangtan 411201, Peoples R China
[2] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410002, Peoples R China
Keywords
CNN; image classification; token; vision transformer; Vision Reservoir;
DOI
10.1109/BigDataSecurity-HPSC-IDS58521.2023.00033
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, the ViT model has been widely used in computer vision, especially for image classification. This paper summarizes the application of ViT to image classification tasks. It first introduces the image classification implementation process and the basic architecture of the ViT model. It then analyzes and summarizes image classification methods, including traditional methods, CNN-based methods, and ViT-based methods, and provides a comparative analysis of CNN and ViT. Finally, it outlines the application prospects and future development of ViT in image classification, along with some shortcomings of ViT and their solutions.
Pages: 135-140
Number of pages: 6