Tuple Perturbation-Based Contrastive Learning Framework for Multimodal Remote Sensing Image Semantic Segmentation

Cited by: 0
Authors
Ye, Yuanxin [1 ,2 ]
Dai, Jinkun [1 ,2 ]
Zhou, Liang [1 ,2 ]
Duan, Keyi [1 ,2 ]
Tao, Ran [3 ]
Li, Wei [3 ]
Hong, Danfeng [4 ,5 ]
Affiliations
[1] Southwest Jiaotong Univ, Fac Geosci & Engn, Chengdu 610031, Peoples R China
[2] Southwest Jiaotong Univ, State Prov Joint Engn Lab Spatial Informat Technol, Chengdu 611756, Peoples R China
[3] Beijing Inst Technol, Sch Informat & Elect, Beijing 100081, Peoples R China
[4] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing 100094, Peoples R China
[5] Univ Chinese Acad Sci, Sch Elect Elect & Commun Engn, Beijing 100049, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Contrastive learning; Remote sensing; Optical sensors; Optical imaging; Radar polarimetry; Adaptive optics; Training; Perturbation methods; multimodal remote sensing image (RSI); negative samples; semantic segmentation; tuple perturbation;
DOI
10.1109/TGRS.2025.3542868
Chinese Library Classification (CLC)
P3 [Geophysics]; P59 [Geochemistry]
Discipline Classification Codes
0708; 070902
Abstract
Deep learning models show promising potential for multimodal remote sensing image semantic segmentation (MRSISS). However, the limited availability of labeled samples for training deep networks significantly constrains their performance. To address this, self-supervised learning (SSL) methods have attracted considerable interest in the remote sensing community. Accordingly, this article proposes a novel multimodal contrastive learning framework based on tuple perturbation, comprising a pretraining stage and a fine-tuning stage. First, a tuple perturbation-based multimodal contrastive learning network (TMCNet) is designed to better explore shared and modality-specific feature representations during pretraining, and a tuple perturbation module is introduced to strengthen multimodal feature extraction by generating more complex negative samples. In the fine-tuning stage, we develop a simple and effective multimodal semantic segmentation network (MSSNet), which exploits complementary information across modalities to suppress noise and integrate multimodal features more effectively, yielding better segmentation performance. Extensive experiments on two published multimodal datasets of optical and synthetic aperture radar (SAR) image pairs show that the proposed framework achieves superior semantic segmentation performance compared with current state-of-the-art methods when labeled samples are limited. The source code is available at https://github.com/yeyuanxin110/TMCNet-MSSNet.
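The abstract describes the contrastive pretraining only at a high level. The sketch below illustrates, under stated assumptions, how a tuple perturbation could inject harder negatives into an InfoNCE-style objective over paired optical and SAR embeddings. The function name tuple_perturbation_nce, the convex-mixing rule, and the parameters alpha and temperature are illustrative assumptions, not the published TMCNet formulation; the authors' actual implementation is in the linked repository.

# Minimal sketch of a tuple-perturbation contrastive objective, assuming
# PyTorch and per-patch embeddings from an optical encoder and a SAR encoder.
# The "perturbation" here mixes embeddings within each optical-SAR tuple to
# create harder negatives; the exact module in TMCNet may differ.
import torch
import torch.nn.functional as F

def tuple_perturbation_nce(z_opt, z_sar, alpha=0.3, temperature=0.1):
    """InfoNCE-style loss with extra negatives built by perturbing each
    optical-SAR tuple (illustrative only; alpha and the mixing rule are
    assumptions, not the published formulation).

    z_opt, z_sar: (N, D) embeddings of paired optical/SAR views.
    """
    z_opt = F.normalize(z_opt, dim=1)
    z_sar = F.normalize(z_sar, dim=1)

    # Perturbed (mixed) embeddings act as additional, harder negatives.
    z_mix = F.normalize(alpha * z_opt + (1.0 - alpha) * z_sar, dim=1)

    # Positive logits: matching optical-SAR pairs.
    pos = (z_opt * z_sar).sum(dim=1, keepdim=True)        # (N, 1)

    # Negative logits: non-matching SAR embeddings plus the perturbed set.
    neg_sar = z_opt @ z_sar.t()                            # (N, N)
    neg_mix = z_opt @ z_mix.t()                            # (N, N)
    mask = torch.eye(z_opt.size(0), dtype=torch.bool, device=z_opt.device)
    neg_sar = neg_sar.masked_fill(mask, float('-inf'))     # drop positives
    neg_mix = neg_mix.masked_fill(mask, float('-inf'))     # drop self-mixes

    logits = torch.cat([pos, neg_sar, neg_mix], dim=1) / temperature
    labels = torch.zeros(z_opt.size(0), dtype=torch.long, device=z_opt.device)
    return F.cross_entropy(logits, labels)

if __name__ == "__main__":
    z_o, z_s = torch.randn(8, 128), torch.randn(8, 128)
    print(tuple_perturbation_nce(z_o, z_s).item())

In this sketch the mixed embeddings sit between the two modalities in feature space, so distinguishing them from the true positive is harder than rejecting ordinary in-batch negatives, which is the general intuition behind perturbation-generated negatives.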
Pages: 15
Related Papers (50 records in total)
  • [41] Multi-granularity semantic alignment distillation learning for remote sensing image semantic segmentation. Zhang, Di; Zhou, Yong; Zhao, Jiaqi; Yang, Zhongyuan; Dong, Hui; Yao, Rui; Ma, Huifang. Frontiers of Computer Science, 2022, 16(4).
  • [43] Multimodal Supervised Contrastive Learning in Remote Sensing Downstream Tasks. Berg, Paul; Uzun, Baki; Pham, Minh-Tan; Courty, Nicolas. IEEE Geoscience and Remote Sensing Letters, 2024, 21: 1-5.
  • [44] Deep Multimodal Fusion Network for Semantic Segmentation Using Remote Sensing Image and LiDAR Data. Sun, Yangjie; Fu, Zhongliang; Sun, Chuanxia; Hu, Yinglei; Zhang, Shengyuan. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60.
  • [45] A multiple instance learning based framework for semantic image segmentation. Gondra, Iker; Xu, Tao. Multimedia Tools and Applications, 2010, 48(2): 339-365.
  • [47] Semantic Segmentation of Urban Remote Sensing Images Based on Deep Learning. Liu, Jingyi; Wu, Jiawei; Xie, Hongfei; Xiao, Dong; Ran, Mengying. Applied Sciences-Basel, 2024, 14(17).
  • [48] Semantic segmentation of remote sensing images based on deep learning methods. Huang, Cong; Yang, Yao; Wang, Huajun; Ma, Yu; Zhao, Jinquan; Wan, Jun. 2021 International Conference on Neural Networks, Information and Communication Engineering, 2021, 11933.
  • [49] Learning to Adapt Adversarial Perturbation Consistency for Domain Adaptive Semantic Segmentation of Remote Sensing Images. Xi, Zhihao; Meng, Yu; Chen, Jingbo; Deng, Yupeng; Liu, Diyou; Kong, Yunlong; Yue, Anzhi. Remote Sensing, 2023, 15(23).
  • [50] Advancing perturbation space expansion based on information fusion for semi-supervised remote sensing image semantic segmentation. Zhou, Liang; Duan, Keyi; Dai, Jinkun; Ye, Yuanxin. Information Fusion, 2025, 117.