Fine-Grained Ship Classification by Combining CNN and Swin Transformer

Cited by: 23
Authors
Huang, Liang [1]
Wang, Fengxiang [1]
Zhang, Yalun [2]
Xu, Qingxia [3]
Affiliations
[1] Naval Univ Engn, Coll Elect Engn, Wuhan 430000, Peoples R China
[2] Naval Univ Engn, Inst Noise & Vibrat, Wuhan 430000, Peoples R China
[3] Natl Univ Def Technol, Coll Int Studies, Wuhan 430000, Peoples R China
Keywords
image classification; ship detection; remote sensing images; self-attention; transformer; CNN
DOI
10.3390/rs14133087
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
The mainstream algorithms used for ship classification and detection can be improved based on convolutional neural networks (CNNs). By analyzing the characteristics of ship images, we found that the difficulty in ship image classification lies in distinguishing ships with similar hull structures but different equipment and superstructures. To extract features such as ship superstructures, this paper introduces a transformer architecture with self-attention into ship classification and detection, and proposes a CNN and Swin transformer model (CNN-Swin model) for ship image classification and detection. The main contributions of this study are as follows: (1) The proposed approach attends to features at different scales in ship image classification and detection, introduces a transformer architecture with self-attention into ship classification and detection for the first time, and uses a parallel network of a CNN and a transformer to extract image features. (2) To exploit the CNN's performance and avoid overfitting as much as possible, a multi-branch CNN-Block is designed and used to construct a simple, accessible CNN backbone for feature extraction. (3) The performance of the CNN-Swin model is validated on the open FGSC-23 dataset and on a dataset of typical military ship categories built from open-source images. The results show that the model achieved accuracies of 90.9% and 91.9% on the FGSC-23 dataset and the military ship dataset, respectively, outperforming nine existing state-of-the-art approaches. (4) The CNN-Swin model's ability to extract ship features is validated by using it as the backbone of three state-of-the-art detection methods on the open datasets HRSC2016 and FAIR1M. The results show the great potential of the CNN-Swin backbone with self-attention in ship detection.
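The abstract describes a parallel network in which a CNN branch and a self-attention branch each extract features from the same image. The record gives no architectural details, so the following is only a toy sketch of that general idea, not the authors' CNN-Swin model. Everything in it is an assumption made for illustration: the function names, the fixed smoothing kernel, the use of plain global attention with Q = K = V (Swin uses learned projections and shifted windows), and fusion by mean pooling and concatenation.

```python
import math

def local_conv_branch(tokens, kernel=(0.25, 0.5, 0.25)):
    """Local branch: 1-D convolution along the token axis with zero padding."""
    n, d = len(tokens), len(tokens[0])
    half = len(kernel) // 2
    out = []
    for i in range(n):
        row = [0.0] * d
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < n:  # positions outside the sequence contribute zero
                for c in range(d):
                    row[c] += w * tokens[j][c]
        out.append(row)
    return out

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention_branch(tokens):
    """Global branch: scaled dot-product self-attention with Q = K = V = tokens.

    Assumption: no learned projections and no windowing, unlike a real Swin block.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, tokens)) for c in range(d)])
    return out

def mean_pool(tokens):
    """Average the token vectors into a single feature vector."""
    n = len(tokens)
    return [sum(t[c] for t in tokens) / n for c in range(len(tokens[0]))]

def parallel_features(tokens):
    """Run both branches on the same input and concatenate the pooled outputs."""
    return mean_pool(local_conv_branch(tokens)) + mean_pool(self_attention_branch(tokens))

# Hypothetical input: four patch tokens with two channels each.
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
features = parallel_features(patches)
```

Here `features` has twice the channel count of the input, with the first half coming from the local branch and the second half from the attention branch. A classifier head would then operate on this fused vector.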
Pages: 23
Related papers
50 items in total
  • [21] Fine-grained ship detection based on consistency criteria of hierarchical classification
    Zhang Zhengning
    Zhang Lin
    Wang Yue
    Li Yunfei
    Yang Yunchao
    CHINESE SPACE SCIENCE AND TECHNOLOGY, 2023, 43 (03) : 93 - 104
  • [22] Remote Sensing Image Harmonization Method for Fine-Grained Ship Classification
    Zhang, Jingpu
    Zhong, Ziyan
    Wei, Xingzhuo
    Wu, Xianyun
    Li, Yunsong
    REMOTE SENSING, 2024, 16 (12)
  • [23] Contrastive Learning for Fine-Grained Ship Classification in Remote Sensing Images
    Chen, Jianqi
    Chen, Keyan
    Chen, Hao
    Li, Wenyuan
    Zou, Zhengxia
    Shi, Zhenwei
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [24] Fine-Grained Visual Classification via Internal Ensemble Learning Transformer
    Xu, Qin
    Wang, Jiahui
    Jiang, Bo
    Luo, Bin
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 9015 - 9028
  • [25] Dual-Dependency Attention Transformer for Fine-Grained Visual Classification
    Cui, Shiyan
    Hui, Bin
    SENSORS, 2024, 24 (07)
  • [26] Leveraging Fine-Grained Labels to Regularize Fine-Grained Visual Classification
    Wu, Junfeng
    Yao, Li
    Liu, Bin
    Ding, Zheyuan
    PROCEEDINGS OF THE 11TH INTERNATIONAL CONFERENCE ON COMPUTER MODELING AND SIMULATION (ICCMS 2019) AND 8TH INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND APPLICATIONS (ICICA 2019), 2019, : 133 - 136
  • [27] An attention cut classification network for fine-grained ship classification in remote sensing images
    Song, Yixuan
    Song, Fei
    Jin, Lei
    Lei, Tao
    Liu, Gang
    Jiang, Ping
    Peng, Zhenming
    REMOTE SENSING LETTERS, 2022, 13 (04) : 418 - 427
  • [28] Gradient aggregation based fine-grained image retrieval: A unified viewpoint for CNN and Transformer
    Yu, Han
    Lu, Huibin
    Zhao, Min
    Li, Zhuoyi
    Gu, Guanghua
    PATTERN RECOGNITION, 2024, 149
  • [29] Nazr-CNN: Fine-Grained Classification of UAV Imagery for Damage Assessment
    Attari, Nazia
    Ofli, Ferda
    Awad, Mohammad
    Lucas, Ji
    Chawla, Sanjay
    2017 IEEE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA), 2017, : 50 - 59
  • [30] Fine grained food image recognition based on swin transformer
    Xiao, Zhiyong
    Diao, Guang
    Deng, Zhaohong
    JOURNAL OF FOOD ENGINEERING, 2024, 380