A Hybrid Algorithm with Swin Transformer and Convolution for Cloud Detection

Cited by: 8
Authors
Gong, Chengjuan [1, 2]
Long, Tengfei [1]
Yin, Ranyu [1]
Jiao, Weili [1]
Wang, Guizhou [1]
Affiliations
[1] Chinese Academy of Sciences, Aerospace Information Research Institute (AIR), Beijing 100094, People's Republic of China
[2] University of Chinese Academy of Sciences, Beijing 100049, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Swin transformer; cloud detection; image segmentation; attention; convolution; LANDSAT; SHADOW;
DOI
10.3390/rs15215264
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08 ; 0830 ;
Abstract
Cloud detection is critical in remote sensing image processing, and convolutional neural networks (CNNs) have significantly advanced this field. However, traditional CNNs primarily extract local features, which is a limitation for cloud detection because clouds vary widely in size, shape, and boundary. To address this limitation, we propose a hybrid Swin transformer-CNN cloud detection (STCCD) network that combines the strengths of both architectures. The STCCD network employs a novel dual-stream encoder that integrates Swin transformer and CNN blocks. Swin transformers capture global context features more effectively than traditional CNNs, while CNNs excel at extracting local features. The two streams are fused via a fusion coupling module (FCM) to produce a richer representation of the input image. To further enhance the network's ability to extract cloud features, we incorporate a feature fusion module based on the attention mechanism (FFMAM) and an aggregation multiscale feature module (AMSFM). The FFMAM selectively merges global and local features based on their importance, while the AMSFM aggregates feature maps from different spatial scales to obtain a more comprehensive representation of the cloud mask. We evaluated the STCCD network on three challenging cloud detection datasets (GF1-WHU, SPARCS, and AIR-CD), as well as on the L8-Biome dataset to assess its generalization capability. The results show that the STCCD network outperformed other state-of-the-art methods on all datasets. Notably, the STCCD model, trained on only four bands (visible and near-infrared) of the GF1-WHU dataset, outperformed the official Landsat-8 Fmask algorithm on the L8-Biome dataset, even though Fmask uses additional bands (shortwave infrared, cirrus, and thermal).
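The abstract says the FFMAM merges global and local features according to their importance but gives no implementation details. The sketch below only illustrates that general idea and is not the authors' module. Everything in it is an assumption made for illustration: the function names (`global_average`, `softmax`, `attention_fuse`) are hypothetical, and the per-channel mean serves as a stand-in importance score where a real network would use learned attention layers.

```python
import math


def global_average(feature_map):
    """Mean of one 2-D feature map given as a list of rows."""
    total = sum(sum(row) for row in feature_map)
    count = sum(len(row) for row in feature_map)
    return total / count


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def attention_fuse(global_feat, local_feat):
    """Fuse two same-shaped feature stacks channel by channel.

    Each input is a list of channels, each channel a list of rows.
    ASSUMPTION: importance is approximated by the channel mean; a
    learned attention branch would replace this in a real model.
    Returns the fused stack and the (global, local) weight pairs.
    """
    if len(global_feat) != len(local_feat):
        raise ValueError("both streams need the same number of channels")
    fused, weights = [], []
    for g_map, l_map in zip(global_feat, local_feat):
        # One pair of weights per channel, summing to 1.
        w_g, w_l = softmax([global_average(g_map), global_average(l_map)])
        weights.append((w_g, w_l))
        fused.append([
            [w_g * g + w_l * l for g, l in zip(g_row, l_row)]
            for g_row, l_row in zip(g_map, l_map)
        ])
    return fused, weights
```

With this stand-in score, a channel whose local response is stronger than its global response is weighted toward the local stream, and the fused value always lies between the two inputs.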
Pages: 26