Beyond Pixel-Level Annotation: Exploring Self-Supervised Learning for Change Detection With Image-Level Supervision

Cited by: 3
Authors
Zhao, Maofan [1, 2, 3]
Hu, Xinli [1, 2, 4]
Zhang, Linlin [1, 2, 4]
Meng, Qingyan [1, 2, 4]
Chen, Yuxing [3]
Bruzzone, Lorenzo [3]
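Affiliations
[1] Chinese Academy of Sciences, Aerospace Information Research Institute, Beijing 100094, People's Republic of China
[2] University of Chinese Academy of Sciences, Beijing 100049, People's Republic of China
[3] University of Trento, Department of Information Engineering and Computer Science, I-38123 Trento, Italy
[4] Hainan Aerospace Information Research Institute, Key Laboratory of Earth Observation of Hainan Province, Sanya 572029, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords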
Task analysis; Cams; Remote sensing; Feature extraction; Supervised learning; Training; Self-supervised learning; Change detection (CD); contrastive learning; equivariant regularization (ER); mutual learning; weakly supervised; SEGMENTATION; NETWORK;
DOI
10.1109/TGRS.2024.3379431
Chinese Library Classification (CLC)
P3 [Geophysics]; P59 [Geochemistry]
Discipline classification codes
0708; 070902
Abstract
Change detection (CD) in high-resolution remote sensing has received considerable attention due to its wide range of applications. Many methods have been proposed in the literature and have achieved excellent performance. However, they are often fully supervised and therefore require abundant pixel-level labeled samples, which are time-consuming and labor-intensive to produce. Labeling bi-temporal images is often even more complicated than the common single-temporal interpretation. Therefore, this study adopts weakly supervised learning (WSL) to reduce label acquisition costs. However, changed regions are small, fragmented, and similar to the background, which widens the gap between weakly supervised and fully supervised tasks. To address these difficulties, we explore self-supervised methods to construct a WSL framework based on image-level labels for general CD, termed WSLCD in this article. First, we design a double-branch Siamese network to derive embeddings and initial class attention maps (CAMs); one branch takes the original image pair as input and the other takes the spatially transformed image pair. Second, mutual learning and equivariant regularization (MLER) are enforced on CAMs from different views, which imposes consistency constraints in confusion regions and lets the CAMs learn from each other based on saliency regions. Furthermore, prototype-based contrastive learning (PCL) is designed so that unreliable pixels can learn from prototypes computed from reliable pixels. PCL includes intraview contrast and cross-view contrast, depending on whether the prototypes and class embeddings come from the same view. With the above strategies, we narrow the gap between image-level weakly supervised CD and fully supervised CD. Experiments are conducted on three CD datasets: CLCD, DSIFN, and GCD. Our method achieves state-of-the-art performance on pseudo-label generation and CD. The code is available at https://github.com/mfzhao1998/WSLCD.
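The two ideas named in the abstract, equivariant regularization on CAMs and prototypes computed from reliable pixels, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (that is in the repository linked above). All function names here are hypothetical. The horizontal flip as the spatial transform, the L1 form of the loss, and the 0.7 reliability threshold are assumptions chosen only for illustration.

```python
import numpy as np


def flip(x):
    """Assumed spatial transform T: horizontal flip along the width axis."""
    return x[..., ::-1]


def equivariance_loss(cam_of_original, cam_of_transformed, transform=flip):
    """Mean absolute gap between T(CAM(x)) and CAM(T(x)).

    The loss is zero when the CAM of the transformed input equals the
    transformed CAM of the original input.
    """
    gap = transform(cam_of_original) - cam_of_transformed
    return float(np.mean(np.abs(gap)))


def masked_prototype(embeddings, cam, threshold=0.7):
    """Average the embeddings of 'reliable' pixels (CAM score > threshold).

    embeddings: array of shape (C, H, W); cam: array of shape (H, W).
    Returns a (C,) prototype, or None if no pixel passes the threshold.
    """
    reliable = cam > threshold
    if not reliable.any():
        return None
    return embeddings[:, reliable].mean(axis=1)


def cosine_to_prototype(embeddings, prototype, eps=1e-8):
    """Cosine similarity between every pixel embedding and the prototype."""
    c, h, w = embeddings.shape
    flat = embeddings.reshape(c, -1)
    num = prototype @ flat
    den = np.linalg.norm(prototype) * np.linalg.norm(flat, axis=0) + eps
    return (num / den).reshape(h, w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cam_a = rng.random((8, 8))          # stand-in for CAM(x)
    cam_b = rng.random((8, 8))          # stand-in for CAM(T(x))
    emb = rng.normal(size=(16, 8, 8))   # stand-in for pixel embeddings
    print("ER loss:", equivariance_loss(cam_a, cam_b))
    proto = masked_prototype(emb, cam_a)
    if proto is not None:
        print("similarity map shape:", cosine_to_prototype(emb, proto).shape)
```

In a contrastive setup of the kind the abstract describes, a similarity map like this one would feed a loss that pulls unreliable pixels toward the prototype of their class. How the intraview and cross-view terms are actually combined is not stated in this record.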
Pages: 1-16
Page count: 16
Related papers
50 entries in total
  • [1] Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation
    Ahn, Jiwoon
    Kwak, Suha
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 4981 - 4990
  • [2] DEEP SELF-SUPERVISED PIXEL-LEVEL LEARNING FOR HYPERSPECTRAL CLASSIFICATION
    Gonzalez-Santiago, Jonathan
    Schenkel, Fabian
    Gross, Wolfgang
    Middelmann, Wolfgang
    [J]. 2022 12TH WORKSHOP ON HYPERSPECTRAL IMAGING AND SIGNAL PROCESSING: EVOLUTION IN REMOTE SENSING (WHISPERS), 2022,
  • [3] A Self-Supervised Approach to Pixel-Level Change Detection in Bi-Temporal RS Images
    Chen, Yuxing
    Bruzzone, Lorenzo
    [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [4] From Image-level to Pixel-level Labeling with Convolutional Networks
    Pinheiro, Pedro O.
    Collobert, Ronan
    [J]. 2015 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2015, : 1713 - 1721
  • [5] Learning to segment with image-level supervision
    Pandey, Gaurav
    Dukkipati, Ambedkar
    [J]. 2019 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2019, : 1856 - 1865
  • [6] HAPiCLR: heuristic attention pixel-level contrastive loss representation learning for self-supervised pretraining
    Tran, Van Nhiem
    Liu, Shen-Hsuan
    Huang, Chi-En
    Aslam, Muhammad Saqlain
    Yang, Kai-Lin
    Li, Yung-Hui
    Wang, Jia-Ching
    [J]. VISUAL COMPUTER, 2024, 40 (11): : 7945 - 7960
  • [7] Pixel-Level Self-Supervised Learning for Semi-Supervised Building Extraction From Remote Sensing Images
    Yu, Anzhu
    Liu, Bing
    Cao, Xuefeng
    Qiu, Chunping
    Guo, Wenyue
    Quan, Yujun
    [J]. IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, 19
  • [9] Sequence2Self: Self-supervised image sequence denoising of pixel-level spray breakup morphology
    Oh, Ji-Hun
    Wood, Eric
    Mayhew, Eric
    Kastengren, Alan
    Lee, Tonghun
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 126
  • [10] Self-Supervised Monocular Depth Estimation With Geometric Prior and Pixel-Level Sensitivity
    Liu, Jierui
    Cao, Zhiqiang
    Liu, Xilong
    Wang, Shuo
    Yu, Junzhi
    [J]. IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2023, 8 (03): : 2244 - 2256