Vision Transformer (ViT)-based Applications in Image Classification

Cited by: 6
Authors
Huo, Yingzi [1]
Jin, Kai [2]
Cai, Jiahong [1]
Xiong, Huixuan [1]
Pang, Jiacheng [1]
Affiliations
[1] Hunan Univ Sci & Technol, Sch Comp Sci & Engn, Hunan Key Lab Serv Comp & Novel Software Technol, Xiangtan 411201, Peoples R China
[2] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410002, Peoples R China
Keywords
CNN; image classification; token; vision transformer; Vision Reservoir
DOI
10.1109/BigDataSecurity-HPSC-IDS58521.2023.00033
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, the ViT model has been widely used in computer vision, especially for image classification. This paper surveys the application of ViT to image classification. It first introduces the image classification pipeline and the basic architecture of the ViT model. It then analyzes and summarizes image classification methods, covering traditional methods, CNN-based methods, and ViT-based methods, and provides a comparative analysis of CNN and ViT. Finally, it outlines the application prospects and future development of ViT in image classification, along with some shortcomings of ViT and their solutions.
Pages: 135-140
Number of pages: 6