Self-supervised monocular depth and ego-motion estimation in endoscopy: Appearance flow to the rescue

Cited by: 38
Authors
Shao, Shuwei [1 ]
Pei, Zhongcai [1 ,5 ]
Chen, Weihai [1 ,5 ]
Zhu, Wentao [2 ]
Wu, Xingming [1 ]
Sun, Dianmin [3 ]
Zhang, Baochang [4 ]
Affiliations
[1] Beihang Univ, Sch Automat Sci & Elect Engn, Beijing, Peoples R China
[2] Kuaishou Technol, Seattle, WA USA
[3] Shandong Univ, Shandong First Med Univ & Shandong Acad Med Sci, Shandong Canc Hosp, Jinan, Peoples R China
[4] Beihang Univ, Inst Artificial Intelligence, Beijing, Peoples R China
[5] Beihang Univ, Hangzhou Innovat Inst, Hangzhou, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation;
Keywords
Self-supervised learning; Monocular depth estimation; Ego-motion; Appearance flow; Brightness calibration; SURGERY; RECONSTRUCTION; NAVIGATION; SLAM;
DOI
10.1016/j.media.2021.102338
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, self-supervised learning technology has been applied to calculate depth and ego-motion from monocular videos, achieving remarkable performance in autonomous driving scenarios. One widely adopted assumption of depth and ego-motion self-supervised learning is that the image brightness remains constant within nearby frames. Unfortunately, the endoscopic scene does not meet this assumption because there are severe brightness fluctuations induced by illumination variations, non-Lambertian reflections and interreflections during data collection, and these brightness fluctuations inevitably deteriorate the depth and ego-motion estimation accuracy. In this work, we introduce a novel concept referred to as appearance flow to address the brightness inconsistency problem. The appearance flow takes into consideration any variations in the brightness pattern and enables us to develop a generalized dynamic image constraint. Furthermore, we build a unified self-supervised framework to estimate monocular depth and ego-motion simultaneously in endoscopic scenes, which comprises a structure module, a motion module, an appearance module and a correspondence module, to accurately reconstruct the appearance and calibrate the image brightness. Extensive experiments are conducted on the SCARED dataset and EndoSLAM dataset, and the proposed unified framework exceeds other self-supervised approaches by a large margin. To validate our framework's generalization ability on different patients and cameras, we train our model on SCARED but test it on the SERV-CT and Hamlyn datasets without any fine-tuning, and the superior results reveal its strong generalization ability. Code is available at: https://github.com/shuweiShao/AF-SfMLearner. (C) 2021 Elsevier B.V. All rights reserved.
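The abstract's core point — a brightness-constancy photometric loss wrongly penalizes illumination changes between endoscopic frames, whereas an appearance flow that models the per-pixel brightness change can calibrate the reconstructed frame before comparison — can be illustrated with a toy sketch. This is not the paper's actual loss or network output; the function, array shapes, and values are illustrative only:

```python
import numpy as np

def photometric_loss(target, reconstructed, appearance_flow=None):
    """Mean absolute photometric error between a target frame and a frame
    reconstructed by view synthesis. If a per-pixel brightness adjustment
    (an 'appearance flow', in the paper's terminology) is supplied, apply
    it to the reconstruction before comparing."""
    if appearance_flow is not None:
        reconstructed = reconstructed + appearance_flow
    return float(np.mean(np.abs(target - reconstructed)))

# Toy frames: the reconstruction is geometrically perfect but globally
# brighter by 0.2, mimicking an illumination change between frames.
target = np.full((4, 4), 0.5)
reconstructed = target + 0.2

# An ideal appearance flow predicts the brightness change and cancels it.
appearance_flow = np.full((4, 4), -0.2)

loss_plain = photometric_loss(target, reconstructed)
loss_calibrated = photometric_loss(target, reconstructed, appearance_flow)
print(loss_plain)       # ≈ 0.2: the brightness shift is penalized as error
print(loss_calibrated)  # ≈ 0.0: calibration removes the spurious penalty
```

Under brightness constancy the plain loss treats the illumination change as geometric error, biasing depth and pose gradients; the calibrated loss isolates the true (here zero) reconstruction error.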
Pages: 16
Related Papers
50 records total
  • [1] Semantic and Optical Flow Guided Self-supervised Monocular Depth and Ego-Motion Estimation
    Fang, Jiaojiao
    Liu, Guizhong
    [J]. IMAGE AND GRAPHICS (ICIG 2021), PT III, 2021, 12890 : 465 - 477
  • [2] Self-supervised monocular depth and ego-motion estimation for CT-bronchoscopy fusion
    Chang, Qi
    Higgins, William E.
    [J]. IMAGE-GUIDED PROCEDURES, ROBOTIC INTERVENTIONS, AND MODELING, MEDICAL IMAGING 2024, 2024, 12928
  • [3] Self-Supervised Attention Learning for Depth and Ego-motion Estimation
    Sadek, Assem
    Chidlovskii, Boris
    [J]. 2020 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2020, : 10054 - 10060
  • [4] Self-supervised learning of monocular depth and ego-motion estimation for non-rigid scenes in wireless capsule endoscopy videos
    Liao, Chao
    Wang, Chengliang
    Wang, Peng
    Wu, Hao
    Wang, Hongqian
    [J]. BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 91
  • [5] Self-Supervised Depth and Ego-Motion Estimation for Monocular Thermal Video Using Multi-Spectral Consistency Loss
    Shin, Ukcheol
    Lee, Kyunghyun
    Lee, Seokju
    Kweon, In So
    [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2022, 7 (02) : 1103 - 1110
  • [6] Self-supervised monocular depth estimation for gastrointestinal endoscopy
    Liu, Yuying
    Zuo, Siyang
    [J]. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2023, 238
  • [7] WS-SfMLearner: Self-supervised Monocular Depth and Ego-motion Estimation on Surgical Videos with Unknown Camera Parameters
    Lou, Ange
    Noble, Jack
    [J]. IMAGE-GUIDED PROCEDURES, ROBOTIC INTERVENTIONS, AND MODELING, MEDICAL IMAGING 2024, 2024, 12928
  • [8] Beyond Photometric Loss for Self-Supervised Ego-Motion Estimation
    Shen, Tianwei
    Luo, Zixin
    Zhou, Lei
    Deng, Hanyu
    Zhang, Runze
    Fang, Tian
    Quan, Long
    [J]. 2019 INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2019, : 6359 - 6365
  • [9] Two Stream Networks for Self-Supervised Ego-Motion Estimation
    Ambrus, Rares
    Guizilini, Vitor
    Li, Jie
    Pillai, Sudeep
    Gaidon, Adrien
    [J]. CONFERENCE ON ROBOT LEARNING, VOL 100, 2019, 100
  • [10] Joint self-supervised learning of interest point, descriptor, depth, and ego-motion from monocular video
    Wang, Zhongyi
    Shen, Mengjiao
    Chen, Qijun
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (32) : 77529 - 77547