Fine-grained ship image classification and detection based on a vision transformer and multi-grain feature vector FPN model

Cited by: 2
Authors
Wang, Fengxiang [1]
Yu, Deying [2]
Huang, Liang [3]
Zhang, Yalun [4]
Chen, Yongbing [2]
Wang, Zhiguo [5]
Affiliations
[1] Natl Univ Def Technol, State Key Lab High Performance Comp, Changsha, Peoples R China
[2] Naval Univ Engn, Sch Elect Engn, Wuhan, Peoples R China
[3] Naval Univ Engn, Coll Elect Engn, Wuhan, Peoples R China
[4] Peoples Liberat Army Naval Command Coll, Combat Command Dept, Nanjing, Peoples R China
[5] Naval Univ Engn, Dept Operat Res & Planning, Wuhan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Deep learning; image classification; ship detection; remote-sensing images; transformer; REMOTE-SENSING IMAGES; NETWORK;
DOI
10.1080/10095020.2024.2331552
Chinese Library Classification number
TP7 [Remote sensing technology]
Discipline classification codes
081102; 0816; 081602; 083002; 1404
Abstract
In naval and civilian domains, meticulous ship classification and detection are paramount. Nevertheless, predominant research has gravitated toward leveraging Convolutional Neural Network (CNN)-centered methodologies, often overlooking the diverse granularity inherent in ship samples. In our pursuit to holistically extract features from ship images across varying granularities, we present a transformative architecture: the Vision Transformer and Multi-Grain Feature Vector Feature Pyramid Network (ViT-MGFV-FPN). This model synergistically melds the merits of MGFV-FPN with an augmented Vision Transformer (ViT) for a comprehensive image feature extraction. To cater to the extraction of broader image features whilst sidestepping the innate quadratic complexity of traditional ViT, we unveil an enhanced version christened the Global Swin Transformer. Concurrently, the MGFV-FPN is orchestrated to harness the prowess of CNNs in distilling intricate ship attributes. Rigorous empirical evaluations underscore our model's superiority in juxtaposition with extant CNN and transformer-based paradigms for nuanced ship categorization.
Pages: 22