Masked representation learning for domain generalized stereo matching

Cited by: 14
Authors
Rao, Zhibo [1 ,2 ]
Xiong, Bangshu [1 ]
He, Mingyi [2 ]
Dai, Yuchao [2 ]
He, Renjie [2 ]
Shen, Zhelun [3 ]
Li, Xing [2 ]
Affiliations
[1] Nanchang Hangkong Univ, Nanchang, Peoples R China
[2] Northwestern Polytech Univ, Xian, Peoples R China
[3] Baidu Res, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
10.1109/CVPR52729.2023.00526
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, many deep stereo matching methods have begun to focus on cross-domain performance and have achieved impressive results. However, these methods do not address the significant volatility of generalization performance across training epochs. Inspired by masked representation learning and multi-task learning, this paper designs a simple and effective masked representation for domain generalized stereo matching. First, we feed the masked left image and the complete right image into the model as input. Then, we add a simple, lightweight decoder after the feature extraction module to recover the original left image. Finally, we train the model on two tasks (stereo matching and image reconstruction) in a pseudo-multi-task learning framework, which encourages the model to learn structural information and improves generalization performance. We implement our method on two well-known architectures (CFNet and LacGwcNet) to demonstrate its effectiveness. Experimental results on multiple datasets show that: (1) our method can be easily plugged into various current stereo matching models to improve generalization performance; (2) our method reduces the significant volatility of generalization performance across training epochs; (3) current methods tend to report the best result among different training epochs as their generalization performance, but in practice it is impossible to select the best epoch using ground truth.
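The two ingredients named in the abstract, masking the left image and training on a combined stereo plus reconstruction objective, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the block-wise zero masking, the patch size, the mask ratio, the L1 losses, and the `recon_weight` value are all assumptions chosen for illustration, since the abstract specifies none of them. The networks themselves (feature extractor, decoder, stereo backbone) are left out entirely.

```python
import random


def mask_patches(image, patch=4, ratio=0.5, seed=0):
    """Zero out a random subset of non-overlapping patch x patch blocks.

    image: list of rows (H x W, grayscale floats); H and W must be
    divisible by `patch`. Returns (masked_image, mask), where
    mask[y][x] == 1 marks a hidden pixel. Hypothetical helper.
    """
    h, w = len(image), len(image[0])
    if h % patch or w % patch:
        raise ValueError("image size must be divisible by the patch size")
    rng = random.Random(seed)
    blocks = [(by, bx) for by in range(0, h, patch) for bx in range(0, w, patch)]
    hidden = rng.sample(blocks, int(len(blocks) * ratio))
    masked = [row[:] for row in image]  # copy so the input stays intact
    mask = [[0] * w for _ in range(h)]
    for by, bx in hidden:
        for y in range(by, by + patch):
            for x in range(bx, bx + patch):
                masked[y][x] = 0.0
                mask[y][x] = 1
    return masked, mask


def mean_abs_error(pred, target):
    """Mean absolute (L1) error between two equally sized 2D lists."""
    total = sum(abs(p - t) for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    return total / (len(target) * len(target[0]))


def joint_loss(disp_pred, disp_gt, recon_left, left, recon_weight=0.1):
    """Stereo loss plus weighted reconstruction loss.

    recon_weight is an assumed value, not taken from the paper.
    """
    stereo = mean_abs_error(disp_pred, disp_gt)
    recon = mean_abs_error(recon_left, left)
    return stereo + recon_weight * recon


# Only the left image is masked; the right image would be passed unchanged.
left = [[1.0] * 8 for _ in range(8)]
masked_left, mask = mask_patches(left, patch=4, ratio=0.5)
```

In a real training loop, `masked_left` and the unmasked right image would go through the stereo network, the decoder output would take the place of `recon_left`, and the reconstruction target would be the original `left` image.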
Pages: 5435 - 5444
Page count: 10
Related papers
50 in total
  • [1] Learning Representations from Foundation Models for Domain Generalized Stereo Matching
    Zhang, Yongjian
    Wang, Longguang
    Li, Kunhong
    Wang, Yun
    Guo, Yulan
    COMPUTER VISION - ECCV 2024, PT XLII, 2025, 15100 : 146 - 162
  • [2] Cascaded recurrent networks with masked representation learning for stereo matching of high-resolution satellite images
    Rao, Zhibo
    Li, Xing
    Xiong, Bangshu
    Dai, Yuchao
    Shen, Zhelun
    Li, Hangbiao
    Lou, Yue
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2024, 218 : 151 - 165
  • [3] Domain Generalized Stereo Matching via Hierarchical Visual Transformation
    Chang, Tianyu
    Yang, Xun
    Zhang, Tianzhu
    Wang, Meng
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 9559 - 9568
  • [4] CLIP4STEREO: REVISITING DOMAIN GENERALIZED STEREO MATCHING VIA CLIP
    Ma, Chihao
    Zeng, Pengcheng
    Zhai, Jucai
    Liu, Yang
    Zhao, Yong
    Wang, Xinan
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 106 - 110
  • [5] MFANet: Multi-feature Aggregation Network for Domain Generalized Stereo Matching
    Yang, Jinlong
    Wang, Gang
    Wu, Cheng
    Chen, Dong
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT XI, ICIC 2024, 2024, 14872 : 241 - 252
  • [6] Revisiting Domain Generalized Stereo Matching Networks from a Feature Consistency Perspective
    Zhang, Jiawei
    Wang, Xiang
    Bai, Xiao
    Wang, Chen
    Huang, Lei
    Chen, Yimin
    Gu, Lin
    Zhou, Jun
    Harada, Tatsuya
    Hancock, Edwin R.
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 12991 - 13001
  • [7] Learning adversarial point-wise domain alignment for stereo matching
    Zhang, Chenghao
    Meng, Gaofeng
    Xu, Richard Yi Da
    Xiang, Shiming
    Pan, Chunhong
    NEUROCOMPUTING, 2022, 491 : 564 - 574
  • [8] IMFA-Stereo: Domain Generalized Stereo Matching via Iterative Multimodal Feature Aggregation Cost Volume
    Wang, Gang
    Yang, Jinlong
    Wu, Cheng
    Chen, Dong
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT XII, ICIC 2024, 2024, 14873 : 118 - 130
  • [9] Edge Domain Adaptation for Stereo Matching
    Li, Xing
    Fan, Yangyu
    Guo, Zhe
    Duan, Yu
    Liu, Shiya
    Dianzi Yu Xinxi Xuebao/Journal of Electronics and Information Technology, 2024, 46 (07) : 2970 - 2980
  • [10] MaDis-Stereo: Enhanced Stereo Matching via Distilled Masked Image Modeling
    Ahn, Jihye
    Choi, Hyesong
    Kim, Soomin
    Min, Dongbo
    IEEE ACCESS, 2025, 13 : 8912 - 8923