Self-supervised monocular depth and ego-motion estimation in endoscopy: Appearance flow to the rescue

Cited by: 38
Authors
Shao, Shuwei [1 ]
Pei, Zhongcai [1 ,5 ]
Chen, Weihai [1 ,5 ]
Zhu, Wentao [2 ]
Wu, Xingming [1 ]
Sun, Dianmin [3 ]
Zhang, Baochang [4 ]
Affiliations
[1] Beihang Univ, Sch Automat Sci & Elect Engn, Beijing, Peoples R China
[2] Kuaishou Technol, Seattle, WA USA
[3] Shandong Univ, Shandong First Med Univ & Shandong Acad Med Sci, Shandong Canc Hosp, Jinan, Peoples R China
[4] Beihang Univ, Inst Artificial Intelligence, Beijing, Peoples R China
[5] Beihang Univ, Hangzhou Innovat Inst, Hangzhou, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation;
Keywords
Self-supervised learning; Monocular depth estimation; Ego-motion; Appearance flow; Brightness calibration; SURGERY; RECONSTRUCTION; NAVIGATION; SLAM;
DOI
10.1016/j.media.2021.102338
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, self-supervised learning has been applied to estimate depth and ego-motion from monocular videos, achieving remarkable performance in autonomous driving scenarios. One widely adopted assumption in self-supervised depth and ego-motion learning is that image brightness remains constant across nearby frames. Unfortunately, endoscopic scenes do not satisfy this assumption: severe brightness fluctuations arise from illumination variations, non-Lambertian reflections and interreflections during data collection, and these fluctuations inevitably degrade the accuracy of depth and ego-motion estimation. In this work, we introduce a novel concept, referred to as appearance flow, to address the brightness-inconsistency problem. Appearance flow accounts for any variation in the brightness pattern and enables us to formulate a generalized dynamic image constraint. Furthermore, we build a unified self-supervised framework that estimates monocular depth and ego-motion simultaneously in endoscopic scenes; it comprises a structure module, a motion module, an appearance module and a correspondence module, which together accurately reconstruct the appearance and calibrate the image brightness. Extensive experiments on the SCARED and EndoSLAM datasets show that the proposed unified framework exceeds other self-supervised approaches by a large margin. To validate the framework's generalization ability across patients and cameras, we train on SCARED and test on the SERV-CT and Hamlyn datasets without any fine-tuning; the superior results reveal its strong generalization ability. Code is available at: https://github.com/shuweiShao/AF-SfMLearner. (C) 2021 Elsevier B.V. All rights reserved.
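The brightness-constancy issue described in the abstract can be made concrete with a small sketch. Under brightness constancy, the self-supervised photometric loss compares a target frame against a view synthesized from a source frame; when illumination changes between frames, that residual is nonzero even for a perfectly estimated depth and pose. The sketch below models appearance flow as a simple per-pixel additive brightness offset — an illustrative assumption for this example, not the paper's exact formulation.

```python
import numpy as np

def photometric_residual(target, warped, appearance_flow=None):
    """L1 photometric residual between a target frame and a view
    synthesized (warped) from a source frame.

    With brightness constancy, the residual is |target - warped|.
    With an appearance-flow map, the warped image is first calibrated
    for brightness changes (modeled here as a per-pixel additive
    offset, a simplifying assumption for illustration).
    """
    if appearance_flow is not None:
        warped = warped + appearance_flow  # brightness calibration
    return np.abs(target - warped)

# Toy example: the scene and geometry are identical between frames,
# but illumination brightened by 0.2, violating brightness constancy.
target = np.full((4, 4), 0.7)   # target frame intensities
warped = np.full((4, 4), 0.5)   # same scene warped from a dimmer frame
flow = np.full((4, 4), 0.2)     # predicted brightness-change map

uncalibrated = photometric_residual(target, warped)
calibrated = photometric_residual(target, warped, flow)
print(uncalibrated.mean())  # ~0.2: spurious error from lighting alone
print(calibrated.mean())    # ~0.0: lighting change explained away
```

The point of the sketch: without calibration, the training signal penalizes correct geometry whenever lighting shifts; with the brightness-change map, the photometric residual again reflects geometric error only.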
Pages: 16
Related Papers
50 records in total
  • [21] Self-Supervised Learning of Non-Rigid Residual Flow and Ego-Motion
    Tishchenko, Ivan
    Lombardi, Sandro
    Oswald, Martin R.
    Pollefeys, Marc
    [J]. 2020 INTERNATIONAL CONFERENCE ON 3D VISION (3DV 2020), 2020, : 150 - 159
  • [22] Self-supervised Monocular Pose and Depth Estimation for Wireless Capsule Endoscopy with Transformers
    Nazifi, Nahid
    Araujo, Helder
    Erabati, Gopi Krishna
    Tahri, Omar
    [J]. IMAGE-GUIDED PROCEDURES, ROBOTIC INTERVENTIONS, AND MODELING, MEDICAL IMAGING 2024, 2024, 12928
  • [23] Digging Into Self-Supervised Monocular Depth Estimation
    Godard, Clement
    Mac Aodha, Oisin
    Firman, Michael
    Brostow, Gabriel
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 3827 - 3837
  • [24] On the uncertainty of self-supervised monocular depth estimation
    Poggi, Matteo
    Aleotti, Filippo
    Tosi, Fabio
    Mattoccia, Stefano
    [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 3224 - 3234
  • [25] Revisiting Self-supervised Monocular Depth Estimation
    Kim, Ue-Hwan
    Lee, Gyeong-Min
    Kim, Jong-Hwan
    [J]. ROBOT INTELLIGENCE TECHNOLOGY AND APPLICATIONS 6, 2022, 429 : 336 - 350
  • [26] Self-supervised monocular depth estimation in fog
    Tao, Bo
    Hu, Jiaxin
    Jiang, Du
    Li, Gongfa
    Chen, Baojia
    Qian, Xinbo
    [J]. OPTICAL ENGINEERING, 2023, 62 (03)
  • [27] Self-Supervised Monocular Depth and Motion Learning in Dynamic Scenes: Semantic Prior to Rescue
    Lee, Seokju
    Rameau, Francois
    Im, Sunghoon
    Kweon, In So
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2022, 130 (09) : 2265 - 2285
  • [29] RETHINKING TRAINING OBJECTIVE FOR SELF-SUPERVISED MONOCULAR DEPTH ESTIMATION: SEMANTIC CUES TO RESCUE
    Li, Keyao
    Li, Ge
    Li, Thomas
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 3308 - 3312
  • [30] Maximizing Self-Supervision From Thermal Image for Effective Self-Supervised Learning of Depth and Ego-Motion
    Shin, Ukcheol
    Lee, Kyunghyun
    Lee, Byeong-Uk
    Kweon, In So
    [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2022, 7 (03) : 7771 - 7778