Fine-grained ship image classification and detection based on a vision transformer and multi-grain feature vector FPN model

Cited by: 2
Authors
Wang, Fengxiang [1]
Yu, Deying [2]
Huang, Liang [3]
Zhang, Yalun [4]
Chen, Yongbing [2]
Wang, Zhiguo [5]
Affiliations
[1] Natl Univ Def Technol, State Key Lab High Performance Comp, Changsha, Peoples R China
[2] Naval Univ Engn, Sch Elect Engn, Wuhan, Peoples R China
[3] Naval Univ Engn, Coll Elect Engn, Wuhan, Peoples R China
[4] Peoples Liberat Army Naval Command Coll, Combat Command Dept, Nanjing, Peoples R China
[5] Naval Univ Engn, Dept Operat Res & Planning, Wuhan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Deep learning; image classification; ship detection; remote-sensing images; transformer; network
DOI
10.1080/10095020.2024.2331552
Chinese Library Classification (CLC) number
TP7 [Remote sensing technology]
Discipline classification codes
081102; 0816; 081602; 083002; 1404
Abstract
In naval and civilian domains, meticulous ship classification and detection are paramount. Nevertheless, predominant research has gravitated toward leveraging Convolutional Neural Network (CNN)-centered methodologies, often overlooking the diverse granularity inherent in ship samples. In our pursuit to holistically extract features from ship images across varying granularities, we present a transformative architecture: the Vision Transformer and Multi-Grain Feature Vector Feature Pyramid Network (ViT-MGFV-FPN). This model synergistically melds the merits of MGFV-FPN with an augmented Vision Transformer (ViT) for a comprehensive image feature extraction. To cater to the extraction of broader image features whilst sidestepping the innate quadratic complexity of traditional ViT, we unveil an enhanced version christened the Global Swin Transformer. Concurrently, the MGFV-FPN is orchestrated to harness the prowess of CNNs in distilling intricate ship attributes. Rigorous empirical evaluations underscore our model's superiority in juxtaposition with extant CNN and transformer-based paradigms for nuanced ship categorization.
Pages: 22