Switchable-Encoder-Based Self-Supervised Learning Framework for Monocular Depth and Pose Estimation

Citations: 0
Authors
Kim, Junoh [1]
Gao, Rui [1]
Park, Jisun [1]
Yoon, Jinsoo [2]
Cho, Kyungeun [3]
Affiliations
[1] Dongguk Univ Seoul, Dept Multimedia Engn, 30 Pildongro 1 Gil, Seoul 04620, South Korea
[2] KoROAD Korea Rd Traff Author, Autonomous Driving Res Dept, 2 Hyeoksin Ro, Wonju Si 26466, Gangwon Do, South Korea
[3] Dongguk Univ Seoul, Div AI Software Convergence, 30 Pildongro 1 Gil, Seoul 04620, South Korea
Keywords
structure from motion; self-supervised learning; monocular depth estimation; visual odometry; deep
DOI
10.3390/rs15245739
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Monocular depth prediction research is essential for expanding meaning from 2D to 3D. Recent studies have focused on the application of a newly proposed encoder; however, the development within the self-supervised learning framework remains unexplored, an aspect critical for advancing foundational models of 3D semantic interpretation. Addressing the dynamic nature of encoder-based research, especially in performance evaluations for feature extraction and pre-trained models, this research proposes the switchable encoder learning framework (SELF). SELF enhances versatility by enabling the seamless integration of diverse encoders in a self-supervised learning context for depth prediction. This integration is realized through the direct transfer of feature information from the encoder and by standardizing the input structure of the decoder to accommodate various encoder architectures. Furthermore, the framework is extended and incorporated into an adaptable decoder for depth prediction and camera pose learning, employing standard loss functions. Comparative experiments with previous frameworks using the same encoder reveal that SELF achieves a 7% reduction in parameters while enhancing performance. Remarkably, substituting newly proposed algorithms in place of an encoder improves the outcomes as well as significantly decreases the number of parameters by 23%. The experimental findings highlight the ability of SELF to broaden depth factors, such as depth consistency. This framework facilitates the objective selection of algorithms as a backbone for extended research in monocular depth prediction.
Pages: 25
Related papers
50 in total
  • [1] Self-Supervised Monocular Depth Estimation Using Hybrid Transformer Encoder
    Hwang, Seung-Jun
    Park, Sung-Jun
    Baek, Joong-Hwan
    Kim, Byungkyu
    [J]. IEEE SENSORS JOURNAL, 2022, 22 (19) : 18762 - 18770
  • [2] Self-Supervised Learning of Monocular Depth Estimation Based on Progressive Strategy
    Wang, Huachun
    Sang, Xinzhu
    Chen, Duo
    Wang, Peng
    Yan, Binbin
    Qi, Shuai
    Ye, Xiaoqian
    Yao, Tong
    [J]. IEEE TRANSACTIONS ON COMPUTATIONAL IMAGING, 2021, 7 : 375 - 383
  • [3] Depth estimation algorithm of monocular image based on self-supervised learning
    Bai L.
    Liu L.-J.
    Li X.-A.
    Wu S.
    Liu R.-Q.
    [J]. Jilin Daxue Xuebao (Gongxueban)/Journal of Jilin University (Engineering and Technology Edition), 2023, 53 (04): : 1139 - 1145
  • [4] A LIGHTWEIGHT SELF-SUPERVISED TRAINING FRAMEWORK FOR MONOCULAR DEPTH ESTIMATION
    Heydrich, Tim
    Yang, Yimin
    Du, Shan
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 2265 - 2269
  • [5] A Dual Encoder-Decoder Network for Self-Supervised Monocular Depth Estimation
    Zheng, Mingkui
    Luo, Lin
    Zheng, Haifeng
    Ye, Zhangfan
    Su, Zhe
    [J]. IEEE SENSORS JOURNAL, 2023, 23 (17) : 19747 - 19756
  • [6] Self-supervised monocular image depth learning and confidence estimation
    Chen, Long
    Tang, Wen
    Wan, Tao Ruan
    John, Nigel W.
    [J]. NEUROCOMPUTING, 2020, 381 : 272 - 281
  • [7] Self-supervised Learning for Dense Depth Estimation in Monocular Endoscopy
    Liu, Xingtong
    Sinha, Ayushi
    Unberath, Mathias
    Ishii, Masaru
    Hager, Gregory D.
    Taylor, Russell H.
    Reiter, Austin
    [J]. OR 2.0 CONTEXT-AWARE OPERATING THEATERS, COMPUTER ASSISTED ROBOTIC ENDOSCOPY, CLINICAL IMAGE-BASED PROCEDURES, AND SKIN IMAGE ANALYSIS, OR 2.0 2018, 2018, 11041 : 128 - 138
  • [8] Exploiting Pseudo Labels in a Self-Supervised Learning Framework for Improved Monocular Depth Estimation
    Petrovai, Andra
    Nedevschi, Sergiu
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 1568 - 1578
  • [9] Self-supervised Monocular Pose and Depth Estimation for Wireless Capsule Endoscopy with Transformers
    Nazifi, Nahid
    Araujo, Helder
    Erabati, Gopi Krishna
    Tahri, Omar
    [J]. IMAGE-GUIDED PROCEDURES, ROBOTIC INTERVENTIONS, AND MODELING, MEDICAL IMAGING 2024, 2024, 12928
  • [10] Self-Supervised Monocular Depth Estimation With Isometric-Self-Sample-Based Learning
    Cha, Geonho
    Jang, Ho-Deok
    Wee, Dongyoon
    [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2023, 8 (04) : 2173 - 2180