Remote sensing image change detection based on CNN-Transformer structure

Cited by: 1

Authors:

Pan, Mengyang [1,2]

Yang, Hang [1]

Fan, Xianghui [1,2]

Affiliations:

[1] Chinese Acad Sci, Changchun Inst Opt Fine Mech & Phys, Changchun 130033, Peoples R China

[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China

Source:

CHINESE JOURNAL OF LIQUID CRYSTALS AND DISPLAYS | 2024, Vol. 39, No. 10

Keywords:

remote sensing images; change detection; convolutional neural network; transformer; atrous spatial pyramid pooling

DOI:

10.37188/CJLCD.2024-0086

Chinese Library Classification (CLC) number:

O7 [Crystallography];

Discipline classification codes:

0702 ; 070205 ; 0703 ; 080501 ;

Abstract:

Change detection in modern high-resolution remote sensing images has achieved remarkable results with the aid of convolutional neural networks (CNNs). However, the limited receptive field of convolution operations leads to insufficient learning of global context and long-distance spatial relationships. Visual Transformers effectively capture long-range dependencies among features, but they handle the details of image changes poorly, resulting in limited spatial localization capability and low computational efficiency. To address these issues, this paper proposes an end-to-end encoder-decoder hybrid CNN-Transformer change detection model with multi-level cross-layer linear fusion, based on atrous (dilated) spatial pyramid pooling, that combines the advantages of visual Transformers and CNNs. Firstly, image features are extracted using a Siamese CNN and refined through dilated pyramid pooling to better capture detailed feature information. Secondly, the extracted features are converted into visual words; a Transformer encoder models these compact visual words, and a Transformer decoder feeds the learned context-rich tokens back into visual space to reinforce the original features. Thirdly, the CNN features are fused with the features from the Transformer encoder-decoder through skip connections, and features of different resolutions are connected through upsampling to fuse position and semantic information. Finally, a difference enhancement module generates difference feature maps containing rich change information. Comprehensive experiments on four publicly accessible remote sensing datasets (LEVIR, CDD, DSIFN and WHUCD) confirm the efficacy of the proposed approach. Compared with other state-of-the-art change detection methods, the model presented in this paper achieves superior classification performance, effectively addressing issues such as under-segmentation, over-segmentation and rough edge segmentation in change detection results.
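
The abstract attributes the CNN's weakness to a limited receptive field and uses dilated (atrous) pyramid pooling to widen it. The sketch below illustrates only that general idea and is not the authors' implementation: a dilated kernel of size k with dilation d spans k + (k - 1)(d - 1) input positions without adding weights. The dilation rates in `RATES` and both function names are assumptions chosen for illustration; the record does not state the rates used in the paper.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Input span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1-D dilated correlation (no padding, stride 1)."""
    span = effective_kernel_size(len(kernel), dilation)
    return [
        # Kernel taps are spaced `dilation` samples apart.
        sum(w * signal[start + j * dilation] for j, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


# Hypothetical dilation rates for parallel pyramid branches (an assumption,
# not taken from the paper).
RATES = (1, 6, 12, 18)

# Span of a 3-tap kernel in each branch: same weight count, wider context.
spans = {d: effective_kernel_size(3, d) for d in RATES}

# A difference kernel with dilation 2 compares samples two positions apart.
response = dilated_conv1d([1, 2, 3, 4, 5, 6, 7], [1, 0, -1], dilation=2)
```

Running several such branches in parallel and concatenating their outputs is the usual way a spatial pyramid combines fine detail (small dilation) with wider context (large dilation).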


Pages: 1361-1379

Page count: 19
