Vision Transformer (ViT)-based Applications in Image Classification

Cited by: 6
Authors
Huo, Yingzi [1]
Jin, Kai [2]
Cai, Jiahong [1]
Xiong, Huixuan [1]
Pang, Jiacheng [1]
Affiliations
[1] Hunan Univ Sci & Technol, Sch Comp Sci & Engn, Hunan Key Lab Serv Comp & Novel Software Technol, Xiangtan 411201, Peoples R China
[2] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410002, Peoples R China
Source
2023 IEEE 9TH INTL CONFERENCE ON BIG DATA SECURITY ON CLOUD, BIGDATASECURITY, IEEE INTL CONFERENCE ON HIGH PERFORMANCE AND SMART COMPUTING, HPSC AND IEEE INTL CONFERENCE ON INTELLIGENT DATA AND SECURITY, IDS | 2023
Keywords
CNN; image classification; token; vision transformer; Vision Reservoir;
DOI
10.1109/BigDataSecurity-HPSC-IDS58521.2023.00033
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, the Vision Transformer (ViT) model has been widely used in computer vision, especially for image classification. This paper surveys the application of ViT to image classification. It first introduces the image classification implementation process and the basic architecture of the ViT model. It then analyzes and summarizes image classification methods, covering traditional methods, CNN-based methods, and ViT-based methods, and provides a comparative analysis of CNN and ViT. Finally, it outlines the application prospects and future development of ViT in image classification, along with some shortcomings of ViT and their solutions.
Pages: 135-140
Number of pages: 6