Multi-task learning with cross-task consistency for improved depth estimation in colonoscopy

Cited by: 0
Authors
Chavarrias Solano, Pedro Esteban [1]
Bulpitt, Andrew [1]
Subramanian, Venkataraman [2, 3]
Ali, Sharib [1]
Affiliations
[1] School of Computer Science, Faculty of Engineering and Physical Sciences, University of Leeds, Leeds, LS2 9JT, United Kingdom
[2] Department of Gastroenterology, Leeds Teaching Hospitals NHS Trust, Leeds, United Kingdom
[3] Division of Gastroenterology and Surgical Sciences, Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, United Kingdom
Keywords
Multi-task learning
DOI
10.1016/j.media.2024.103379
Abstract
Colonoscopy screening is the gold standard procedure for assessing abnormalities in the colon and rectum, such as ulcers and cancerous polyps. Measuring the abnormal mucosal area and reconstructing it in 3D can help quantify the surveyed area and objectively evaluate disease burden. However, due to the complex topology of these organs and variable physical conditions (for example, lighting, large homogeneous textures, and image modality), estimating distance from the camera (i.e., depth) is highly challenging. Moreover, most colonoscopic video acquisition is monocular, making depth estimation a non-trivial problem. While depth estimation methods in computer vision have been proposed and advanced on natural scene datasets, the efficacy of these techniques has not been widely quantified on colonoscopy datasets. As the colonic mucosa has several low-texture regions that are not well pronounced, learning representations from an auxiliary task can improve salient feature extraction, allowing accurate estimation of camera depth. In this work, we propose a novel multi-task learning (MTL) approach with a shared encoder and two decoders, namely a surface normal decoder and a depth estimator decoder. Our depth estimator incorporates attention mechanisms to enhance global context awareness. We leverage the surface normal prediction to improve geometric feature extraction. We also apply a cross-task consistency loss between the two geometrically related tasks, surface normal and camera depth. We demonstrate an improvement of 15.75% in relative error and of 10.7% in δ1.25 accuracy over the most accurate state-of-the-art baseline, the Big-to-Small (BTS) approach. All experiments are conducted on the recently released C3VD dataset, and we thus provide a first benchmark of state-of-the-art methods on this dataset. © 2024 The Authors
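The abstract names a cross-task consistency loss between depth and surface normals and reports results as relative error and δ1.25 accuracy, but gives no formulas. The following is a minimal NumPy sketch of one common way such quantities are computed, not the authors' implementation. The function names, the finite-difference normal approximation (which ignores camera intrinsics), and the cosine-distance form of the consistency term are all assumptions made for illustration.

```python
import numpy as np


def normals_from_depth(depth):
    """Approximate unit surface normals from a depth map of shape (H, W).

    Assumption: finite differences on the pixel grid, ignoring camera
    intrinsics. A real pipeline would back-project with the calibration.
    """
    dz_dy, dz_dx = np.gradient(depth)  # derivatives along rows, then columns
    n = np.stack([-dz_dx, -dz_dy, np.ones_like(dz_dx)], axis=-1)
    return n / np.linalg.norm(n, axis=-1, keepdims=True)


def cross_task_consistency_loss(pred_depth, pred_normals, eps=1e-8):
    """Hypothetical consistency term: mean (1 - cosine similarity) between
    the predicted normals and the normals derived from the predicted depth."""
    derived = normals_from_depth(pred_depth)
    norm = np.linalg.norm(pred_normals, axis=-1, keepdims=True)
    unit = pred_normals / (norm + eps)
    cosine = np.sum(derived * unit, axis=-1)
    return float(np.mean(1.0 - cosine))


def abs_rel_error(pred, gt):
    """Absolute relative error: mean(|pred - gt| / gt) over valid pixels."""
    return float(np.mean(np.abs(pred - gt) / gt))


def delta_accuracy(pred, gt, threshold=1.25):
    """Fraction of pixels where max(pred / gt, gt / pred) < threshold."""
    ratio = np.maximum(pred / gt, gt / pred)
    return float(np.mean(ratio < threshold))
```

In a training loop, a term like `cross_task_consistency_loss` would typically be added to the per-task depth and normal losses with a weighting factor; the weighting used in the paper is not stated in this abstract.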
Related papers
50 in total
  • [31] Multi-task Representation Learning for Travel Time Estimation
    Li, Yaguang
    Fu, Kun
    Wang, Zheng
    Shahabi, Cyrus
    Ye, Jieping
    Liu, Yan
    KDD'18: PROCEEDINGS OF THE 24TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2018, : 1695 - 1704
  • [32] Multi-Task Rank Learning for Visual Saliency Estimation
    Li, Jia
    Tian, Yonghong
    Huang, Tiejun
    Gao, Wen
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2011, 21 (05) : 623 - 636
  • [33] Cross-Task Crowdsourcing
    Mo, Kaixiang
    Zhong, Erheng
    Yang, Qiang
    19TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING (KDD'13), 2013, : 677 - 685
  • [34] Multi-task Sparse Gaussian Processes with Improved Multi-task Sparsity Regularization
    Zhu, Jiang
    Sun, Shiliang
    PATTERN RECOGNITION (CCPR 2014), PT I, 2014, 483 : 54 - 62
  • [35] Cross-stitch Networks for Multi-task Learning
    Misra, Ishan
    Shrivastava, Abhinav
    Gupta, Abhinav
    Hebert, Martial
    2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 3994 - 4003
  • [36] Multi-task Supervised Learning via Cross-learning
    Cervino, Juan
    Andres Bazerque, Juan
    Calvo-Fullana, Miguel
    Ribeiro, Alejandro
    29TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2021), 2021, : 1381 - 1385
  • [37] Semi-supervised Multi-task Learning for Semantics and Depth
    Wang, Yufeng
    Tsai, Yi-Hsuan
    Hung, Wei-Chih
    Ding, Wenrui
    Liu, Shuo
    Yang, Ming-Hsuan
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 2663 - 2672
  • [38] SEQUENTIAL CROSS ATTENTION BASED MULTI-TASK LEARNING
    Kim, Sunkyung
    Choi, Hyesong
    Min, Dongbo
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 2311 - 2315
  • [39] MULTI-TASK LEARNING WITH CROSS ATTENTION FOR KEYWORD SPOTTING
    Higuchi, Takuya
    Gupta, Anmol
    Dhir, Chandra
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 571 - 578
  • [40] Color–depth multi-task learning for object detection in haze
    Zhe Chen
    Xin Wang
    Tanghuai Fan
    Lizhong Xu
    Neural Computing and Applications, 2020, 32 : 6591 - 6599