Remote Sensing Object Detection Based on Convolution and Swin Transformer

Cited by: 14
Authors
Jiang, Xuzhao [1]
Wu, Yonghong [1]
Affiliations
[1] Wuhan Univ Technol, Dept Stat, Wuhan 430070, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Object detection; Feature extraction; Transformers; Remote sensing; Prediction algorithms; Detection algorithms; Classification algorithms; Remote sensing images; object detection; attention mechanism; swin transformer; multi-scale features;
DOI
10.1109/ACCESS.2023.3267435
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Remote sensing object detection is an essential task for surveying the earth. Object detection algorithms designed for natural scenes struggle to obtain satisfactory results on remote sensing images. In this paper, the RAST-YOLO (You Only Look Once with Region Attention and Swin Transformer) algorithm is proposed to address the problems of remote sensing object detection, such as large differences in target scale, complex backgrounds, and tightly arranged small targets. To widen the information interaction range of the feature map, make full use of the background information around each object, and improve detection accuracy for objects with complex backgrounds, a Region Attention (RA) mechanism combined with Swin Transformer is proposed as the backbone for feature extraction. To improve the detection accuracy of small objects, the C3D module is used to fuse deep and shallow semantic information and to mitigate the multi-scale problem of remote sensing targets. To evaluate the performance of RAST-YOLO, extensive experiments are performed on the DIOR and TGRS-HRRSD datasets. The experimental results show that RAST-YOLO achieves state-of-the-art detection accuracy with high efficiency and robustness. Specifically, compared with the baseline network, the mean average precision (mAP) is improved by 5% and 2.3% on the DIOR and TGRS-HRRSD datasets, respectively, which demonstrates that RAST-YOLO is effective. Moreover, the lightweight structure of RAST-YOLO maintains real-time detection speed while still obtaining excellent detection results.
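The abstract reports its results as mean average precision (mAP). As a rough illustration of how that metric is commonly computed, the sketch below averages per-class average precision (AP) using all-point interpolation of the precision-recall curve. This is an assumption for illustration only: the abstract does not state the evaluation protocol (IoU threshold, interpolation scheme) the authors used. The function names and the toy detections are hypothetical, and each detection is assumed to be already matched to ground truth as a true or false positive.

```python
from typing import Dict, List, Tuple

# One detection: (confidence score, True if it matched a ground-truth box).
Detection = Tuple[float, bool]


def average_precision(detections: List[Detection], num_gt: int) -> float:
    """All-point interpolated AP for a single class (assumed protocol)."""
    if num_gt == 0:
        return 0.0
    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    recalls: List[float] = []
    precisions: List[float] = []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_gt)
        precisions.append(tp / (tp + fp))
    # Interpolate: precision at each rank becomes the max precision to its right.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the rectangle areas under the interpolated precision-recall curve.
    ap = 0.0
    prev_recall = 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class: Dict[str, Tuple[List[Detection], int]]) -> float:
    """Mean of per-class AP; per_class maps class name -> (detections, num_gt)."""
    aps = [average_precision(dets, n) for dets, n in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0


# Toy data, made up for illustration (not taken from DIOR or TGRS-HRRSD).
toy = {
    "airplane": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "ship": ([(0.6, True)], 1),
}
print(round(mean_average_precision(toy), 4))
```

Here the match flags are taken as given; in a full evaluation they would come from comparing predicted and ground-truth boxes at a chosen IoU threshold.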
Pages: 38643-38656
Number of pages: 14