Fine-Grained Ship Classification by Combining CNN and Swin Transformer

Cited by: 23
Authors
Huang, Liang [1]
Wang, Fengxiang [1]
Zhang, Yalun [2]
Xu, Qingxia [3]
Affiliations
[1] Naval Univ Engn, Coll Elect Engn, Wuhan 430000, Peoples R China
[2] Naval Univ Engn, Inst Noise & Vibrat, Wuhan 430000, Peoples R China
[3] Natl Univ Def Technol, Coll Int Studies, Wuhan 430000, Peoples R China
Keywords
image classification; ship detection; remote sensing images; self-attention; transformer; CNN
DOI
10.3390/rs14133087
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
The mainstream algorithms used for ship classification and detection can be improved based on convolutional neural networks (CNNs). By analyzing the characteristics of ship images, we found that the difficulty in ship image classification lies in distinguishing ships with similar hull structures but different equipment and superstructures. To extract features such as ship superstructures, this paper introduces transformer architecture with self-attention into ship classification and detection, and a CNN and Swin transformer model (CNN-Swin model) is proposed for ship image classification and detection. The main contributions of this study are as follows: (1) The proposed approach pays attention to different scale features in ship image classification and detection, introduces a transformer architecture with self-attention into ship classification and detection for the first time, and uses a parallel network of a CNN and a transformer to extract features of images. (2) To exploit the CNN's performance and avoid overfitting as much as possible, a multi-branch CNN-Block is designed and used to construct a CNN backbone with simplicity and accessibility to extract features. (3) The performance of the CNN-Swin model is validated on the open FGSC-23 dataset and a dataset containing typical military ship categories based on open-source images. The results show that the model achieved accuracies of 90.9% and 91.9% for the FGSC-23 dataset and the military ship dataset, respectively, outperforming the existing nine state-of-the-art approaches. (4) The good extraction effect on the ship features of the CNN-Swin model is validated as the backbone of the three state-of-the-art detection methods on the open datasets HRSC2016 and FAIR1M. The results show the great potential of the CNN-Swin backbone with self-attention in ship detection.
Pages: 23
Related papers
50 in total
  • [31] ASP-CNN: aligning semantic parts for fine-grained image classification
    Ge, Hao
    Tu, Xiaoguang
    Xie, Mei
    Ma, Zheng
    JOURNAL OF ELECTRONIC IMAGING, 2019, 28 (02)
  • [32] Fine-Grained Classification of Remote Sensing Ship Images Based on Improved VAN
    Zhou, Guoqing
    Huang, Liang
    Sun, Qiao
    CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 77 (02): 1985-2007
  • [33] Distribution Shift Metric Learning for Fine-Grained Ship Classification in SAR Images
    Xu, Yongjie
    Lang, Haitao
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2020, 13: 2276-2285
  • [34] A Public Dataset for Fine-Grained Ship Classification in Optical Remote Sensing Images
    Di, Yanghua
    Jiang, Zhiguo
    Zhang, Haopeng
    REMOTE SENSING, 2021, 13 (04): 1-12
  • [35] A Novel Multiscale Contrastive Learning Network for Fine-Grained Ocean Ship Classification
    Dong, Shaokang
    Feng, Jiangfan
    Fang, Dongxu
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2024, 17: 9989-10005
  • [36] A vision transformer for fine-grained classification by reducing noise and enhancing discriminative information
    Zhang, Zi-Chao
    Chen, Zhen-Duo
    Wang, Yongxin
    Luo, Xin
    Xu, Xin-Shun
    PATTERN RECOGNITION, 2024, 145
  • [37] A Strong Vision Transformer Adapter with Adaptive Thresholding for Fine-Grained Building Classification
    Lu, Xiaoqiang
    Jiao, Licheng
    Liu, Qiong
    Li, Lingling
    Liu, Fang
    Liu, Xu
    Yang, Yuting
    IGARSS 2023 - 2023 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2023: 674-677
  • [38] Fine-grained imbalanced leukocyte classification with global-local attention transformer
    Chen, Ben
    Qin, Feiwei
    Shao, Yanli
    Cao, Jin
    Peng, Yong
    Ge, Ruiquan
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2023, 35 (08)
  • [39] Convolutionally Enhanced Feature Fusion Visual Transformer for Fine-Grained Visual Classification
    Huang, Min
    Zhu, Saixing
    Wang, Zehua
    Qu, Shuanghong
    2024 16TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING, ICMLC 2024, 2024: 447-452
  • [40] Fine-grained ship image classification and detection based on a vision transformer and multi-grain feature vector FPN model
    Wang, Fengxiang
    Yu, Deying
    Huang, Liang
    Zhang, Yalun
    Chen, Yongbing
    Wang, Zhiguo
    GEO-SPATIAL INFORMATION SCIENCE, 2024