Remote Sensing Object Detection Based on Convolution and Swin Transformer

Cited by: 14
Authors
Jiang, Xuzhao [1]
Wu, Yonghong [1]
Affiliations
[1] Wuhan Univ Technol, Dept Stat, Wuhan 430070, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Object detection; Feature extraction; Transformers; Remote sensing; Prediction algorithms; Detection algorithms; Classification algorithms; Remote sensing images; object detection; attention mechanism; swin transformer; multi-scale features;
DOI
10.1109/ACCESS.2023.3267435
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Remote sensing object detection is an essential task for surveying the earth. Object detection algorithms designed for natural scenes struggle to obtain satisfactory results on remote sensing images. In this paper, the RAST-YOLO (You Only Look Once with Regin Attention and Swin Transformer) algorithm is proposed to address the problems of remote sensing object detection, such as large differences in target scale, complex backgrounds, and tightly arranged small targets. To increase the information interaction range of the feature map, make full use of the background information around the object, and improve detection accuracy for objects with complex backgrounds, the Regin Attention (RA) mechanism is combined with Swin Transformer as the backbone for feature extraction. To improve the detection accuracy of small objects, the C3D module is used to fuse deep and shallow semantic information and mitigate the multi-scale problem of remote sensing targets. To evaluate the performance of RAST-YOLO, extensive experiments are performed on the DIOR and TGRS-HRRSD datasets. The experimental results show that RAST-YOLO achieves state-of-the-art detection accuracy with high efficiency and robustness. Specifically, compared with the baseline network, the mean average precision (mAP) is improved by 5% and 2.3% on the DIOR and TGRS-HRRSD datasets, respectively, which demonstrates that RAST-YOLO is effective and superior. Moreover, the lightweight structure of RAST-YOLO maintains real-time detection speed while obtaining excellent detection results.
Pages: 38643-38656
Number of pages: 14
Related papers
50 in total
  • [21] MDCT: Multi-Kernel Dilated Convolution and Transformer for One-Stage Object Detection of Remote Sensing Images
    Chen, Juanjuan
    Hong, Hansheng
    Song, Bin
    Guo, Jie
    Chen, Chen
    Xu, Junjie
    REMOTE SENSING, 2023, 15 (02)
  • [22] HIERARCHICAL REGION BASED CONVOLUTION NEURAL NETWORK FOR MULTISCALE OBJECT DETECTION IN REMOTE SENSING IMAGES
    Li, Qingpeng
    Mou, Lichao
    Jiang, Kaiyu
    Liu, Qingjie
    Wang, Yunhong
    Zhu, Xiao Xiang
    IGARSS 2018 - 2018 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2018, : 4355 - 4358
  • [23] Multi-scale Feature Fusion Object Detection Based on Swin Transformer
    Zhang, Ying
    Wu, Lin
    Deng, Huaxuan
    Hu, Jun
    Li, Xifan
    39TH YOUTH ACADEMIC ANNUAL CONFERENCE OF CHINESE ASSOCIATION OF AUTOMATION, YAC 2024, 2024, : 1982 - 1987
  • [24] REMOTE SENSING IMAGES CHANGE DETECTION USING THE SIAMESE NETWORK COMBINED WITH PURE SWIN TRANSFORMER
    Song, Xu
    Tong, Xinyu
    Hajamydeen, Asif Iqbal
    UPB Scientific Bulletin, Series C: Electrical Engineering and Computer Science, 2024, 2024 (04): : 241 - 252
  • [25] Road Extraction from Remote Sensing Imagery with Spatial Attention Based on Swin Transformer
    Zhu, Xianhong
    Huang, Xiaohui
    Cao, Weijia
    Yang, Xiaofei
    Zhou, Yunfei
    Wang, Shaokai
    REMOTE SENSING, 2024, 16 (07)
  • [26] LiteST-Net: A Hybrid Model of Lite Swin Transformer and Convolution for Building Extraction from Remote Sensing Image
    Yuan, Wei
    Zhang, Xiaobo
    Shi, Jibao
    Wang, Jin
    REMOTE SENSING, 2023, 15 (08)
  • [27] Swin Transformer Combined with Convolution Neural Network for Surface Defect Detection
    Li, Yinghao
    Xiang, Yihao
    Guo, Haogong
    Liu, Panpan
    Liu, Chengming
    MACHINES, 2022, 10 (11)
  • [28] Salient Object Detection in Optical Remote Sensing Images Driven by Transformer
    Li, Gongyang
    Bai, Zhen
    Liu, Zhi
    Zhang, Xinpeng
    Ling, Haibin
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 5257 - 5269
  • [29] Transformer with Transfer CNN for Remote-Sensing-Image Object Detection
    Li, Qingyun
    Chen, Yushi
    Zeng, Ying
    REMOTE SENSING, 2022, 14 (04)
  • [30] ER-Swin: Feature Enhancement and Refinement Network Based on Swin Transformer for Semantic Segmentation of Remote Sensing Images
    Liu, Jiang
    Cheng, Shuli
    Du, Anyu
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2024, 21 : 1 - 5