Fine-grained ship image classification and detection based on a vision transformer and multi-grain feature vector FPN model

Cited by: 2
Authors
Wang, Fengxiang [1]
Yu, Deying [2]
Huang, Liang [3]
Zhang, Yalun [4]
Chen, Yongbing [2]
Wang, Zhiguo [5]
Affiliations
[1] Natl Univ Def Technol, State Key Lab High Performance Comp, Changsha, Peoples R China
[2] Naval Univ Engn, Sch Elect Engn, Wuhan, Peoples R China
[3] Naval Univ Engn, Coll Elect Engn, Wuhan, Peoples R China
[4] Peoples Liberat Army Naval Command Coll, Combat Command Dept, Nanjing, Peoples R China
[5] Naval Univ Engn, Dept Operat Res & Planning, Wuhan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Deep learning; image classification; ship detection; remote-sensing images; transformer; network;
DOI
10.1080/10095020.2024.2331552
Chinese Library Classification (CLC) number
TP7 [Remote sensing technology];
Discipline classification codes
081102 ; 0816 ; 081602 ; 083002 ; 1404 ;
Abstract
In naval and civilian domains, meticulous ship classification and detection are paramount. Nevertheless, predominant research has gravitated toward leveraging Convolutional Neural Network (CNN)-centered methodologies, often overlooking the diverse granularity inherent in ship samples. In our pursuit to holistically extract features from ship images across varying granularities, we present a transformative architecture: the Vision Transformer and Multi-Grain Feature Vector Feature Pyramid Network (ViT-MGFV-FPN). This model synergistically melds the merits of MGFV-FPN with an augmented Vision Transformer (ViT) for comprehensive image feature extraction. To cater to the extraction of broader image features whilst sidestepping the innate quadratic complexity of traditional ViT, we unveil an enhanced version christened the Global Swin Transformer. Concurrently, the MGFV-FPN is orchestrated to harness the prowess of CNNs in distilling intricate ship attributes. Rigorous empirical evaluations underscore our model's superiority in juxtaposition with extant CNN and transformer-based paradigms for nuanced ship categorization.
Pages: 22
Related papers
50 in total
  • [1] Survey of Vision Transformer in Fine-Grained Image Classification
    Sun, Lulu
    Liu, Jianping
    Wang, Jian
    Xing, Jialu
    Zhang, Yue
    Wang, Chenyang
    Computer Engineering and Applications, 60 (10): 30 - 46
  • [2] Multi-Scale Feature Transformer Based Fine-Grained Image Classification Method
    Zhang T.
    Cai C.
    Luo X.
    Zhu Y.
    Beijing Youdian Daxue Xuebao/Journal of Beijing University of Posts and Telecommunications, 2023, 46 (04): 70 - 75
  • [3] Fine-grained bird image classification based on counterfactual method of vision transformer model
    Tianhua Chen
    Yanyue Li
    Qinghua Qiao
    The Journal of Supercomputing, 2024, 80 : 6221 - 6239
  • [4] Fine-grained bird image classification based on counterfactual method of vision transformer model
    Chen, Tianhua
    Li, Yanyue
    Qiao, Qinghua
    JOURNAL OF SUPERCOMPUTING, 2024, 80 (05): 6221 - 6239
  • [5] Fine-Grained Image Classification Model Based on Improved Transformer
    Tian Zhansheng
    Liu Libo
    LASER & OPTOELECTRONICS PROGRESS, 2023, 60 (02)
  • [6] FPT: Fine-Grained Detection of Driver Distraction Based on the Feature Pyramid Vision Transformer
    Wang, HaiTao
    Chen, Jie
    Huang, ZhiXiang
    Li, Bing
    Lv, JianMing
    Xi, JingMin
    Wu, BoCai
    Zhang, Jun
    Wu, ZhongCheng
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (02) : 1594 - 1608
  • [7] Ship fine-grained classification network based on multi-scale feature fusion
    Chen, Lisu
    Wang, Qian
    Zhu, Enyan
    Feng, Daolun
    Wu, Huafeng
    Liu, Tao
    OCEAN ENGINEERING, 2025, 318
  • [8] Fine-Grained Image Classification Based on Multi-Scale Feature Fusion
    Li Siyao
    Liu Yuhong
    Zhang Rongfen
    LASER & OPTOELECTRONICS PROGRESS, 2020, 57 (12)
  • [9] Fine-Grained Ship Classification by Combining CNN and Swin Transformer
    Huang, Liang
    Wang, Fengxiang
    Zhang, Yalun
    Xu, Qingxia
    REMOTE SENSING, 2022, 14 (13)
  • [10] Hierarchical attention vision transformer for fine-grained visual classification
    Hu, Xiaobin
    Zhu, Shining
    Peng, Taile
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2023, 91