Vehicle Classification Algorithm Based on Improved Vision Transformer

Cited by: 1
Authors
Dong, Xinlong [1]
Shi, Peicheng [1]
Tang, Yueyue [1]
Yang, Li [1]
Yang, Aixi [2]
Liang, Taonian [3]
Affiliations
[1] Anhui Polytech Univ, Sch Mech & Automot Engn, Wuhu 241000, Peoples R China
[2] Zhejiang Univ, Polytech Inst, Hangzhou 310015, Peoples R China
[3] Chery New Energy Automobile Co Ltd, Wuhu 241000, Peoples R China
Source
WORLD ELECTRIC VEHICLE JOURNAL | 2024 / Vol. 15 / Issue 08
Keywords
vehicle classification; vision transformer; local detail features; sparse attention module; contrast loss;
DOI
10.3390/wevj15080344
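Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809
Abstract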
Vehicle classification technology is one of the foundations of autonomous driving. With the development of deep learning, vision transformer architectures based on attention mechanisms can represent global information quickly and effectively. However, because the image is directly segmented into patches, local feature details are lost. To solve this problem, we propose an improved vision transformer vehicle classification network (IND-ViT). Specifically, we first design a CNN-In D branch module that extracts local features before image segmentation to compensate for the loss of detail information in the vision transformer. Then, to address the misdetection caused by the high similarity between some vehicles, we propose a sparse attention module that screens out the discernible regions in the image and further improves the model's ability to represent detailed features. Finally, we use a contrast loss function to further increase the intra-class consistency and inter-class difference of the classification features and improve the accuracy of vehicle classification. Experimental results show that the proposed model achieves higher accuracy than the original vision transformer model on the BIT-Vehicles, CIFAR-10, Oxford Flower-102, and Caltech-101 datasets, with improvements of 1.3%, 1.21%, 7.54%, and 3.60%, respectively. The model also meets a certain real-time requirement, balancing accuracy and real-time performance.
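
The abstract does not give the form of the contrast loss, so the following is only a minimal sketch of one common margin-based formulation, included to illustrate what "increasing intra-class consistency and inter-class difference" can mean in practice. The function names `cosine` and `contrastive_loss`, the `margin` value of 0.4, and the averaging over all pairs are assumptions for illustration and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(features, labels, margin=0.4):
    """Illustrative margin-based contrastive loss over a batch.

    Assumed form (not taken from the paper):
      - same-class pairs contribute (1 - similarity), pulling them together;
      - different-class pairs contribute max(similarity - margin, 0),
        pushing them apart only while they are more similar than the margin.
    The sum is averaged over all n * n ordered pairs.
    """
    n = len(labels)
    total = 0.0
    for i in range(n):
        for j in range(n):
            sim = cosine(features[i], features[j])
            if labels[i] == labels[j]:
                total += 1.0 - sim
            else:
                total += max(sim - margin, 0.0)
    return total / (n * n)


# Two tight clusters along orthogonal directions.
batch = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
loss_matched = contrastive_loss(batch, [0, 0, 1, 1])  # labels agree with clusters
loss_mixed = contrastive_loss(batch, [0, 1, 0, 1])    # labels cut across clusters
```

In this toy batch the loss is zero when the labels agree with the clusters and positive when they cut across them, which is the behaviour such a term is meant to reward. In a training setup a term like this would typically be added to the usual cross-entropy classification loss, but how the authors weight or combine the two is not stated in the abstract.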
Pages: 18