Self-Supervised Pretraining With Monocular Height Estimation for Semantic Segmentation

Cited by: 0
Authors
Xiong, Zhitong [1]
Chen, Sining [1]
Shi, Yilei [2]
Zhu, Xiao Xiang [1, 3]
Affiliations
[1] Tech Univ Munich TUM, Chair Data Sci Earth Observat, D-80333 Munich, Germany
[2] Tech Univ Munich TUM, Sch Engn & Design, D-80333 Munich, Germany
[3] Munich Ctr Machine Learning, Chair Data Sci Earth Observat, D-80333 Munich, Germany
Keywords
Semantics; Task analysis; Estimation; Neurons; Semantic segmentation; Data models; Buildings; Foundation models; interpretable deep learning; monocular height estimation (MHE); self-supervised pretraining
DOI
10.1109/TGRS.2024.3412629
Chinese Library Classification (CLC) number
P3 [Geophysics]; P59 [Geochemistry]
Discipline classification code
0708; 070902
Abstract
Monocular height estimation (MHE) is key to generating 3-D city models, which are essential for swift disaster response. Moving beyond the traditional focus on performance enhancement, our study probes the interpretability of MHE networks. We find that neurons within MHE models are selective for both height and semantic classes. This insight sheds light on the inner workings of MHE models and suggests strategies for leveraging elevation data more effectively. Building on it, we propose a framework that employs MHE as a self-supervised pretraining method for remote sensing (RS) imagery. This approach significantly enhances the performance of semantic segmentation tasks. Furthermore, we develop a disentangled latent transformer (DLT) module that leverages explainable deep representations from pretrained MHE networks for unsupervised semantic segmentation. Our method demonstrates the significant potential of MHE tasks in developing foundation models for sophisticated pixel-level semantic analyses. Additionally, we present a new dataset designed to benchmark the performance of both semantic segmentation and height estimation tasks. The dataset and code will be publicly available at https://github.com/zhu-xlab/DLT-MHE.pytorch.
Pages: 12