Multi-class motion-based semantic segmentation for ureteroscopy and laser

Cited by: 3
Authors
Gupta, Soumya [1 ,2 ]
Ali, Sharib [1 ,2 ,3 ,6 ]
Goldsmith, Louise [5 ]
Turney, Ben [5 ]
Rittscher, Jens [1 ,2 ,3 ,4 ]
Affiliations
[1] Univ Oxford, Inst Biomed Engn IBME, Dept Engn Sci, Oxford, England
[2] Univ Oxford, Big Data Inst, Li Ka Shing Ctr Hlth Informat & Discovery, Oxford, England
[3] Univ Oxford, Oxford NIHR Biomed Res Ctr, Oxford, England
[4] Univ Oxford, Ludwig Inst Canc Res, Nuffield Dept Clin Med, Oxford, England
[5] Oxford Univ Hosp NHS Trust, Dept Urol, Oxford, England
[6] Univ Leeds, Sch Comp, Leeds, England
Funding
Wellcome Trust (UK);
Keywords
Ureteroscopy; Laser lithotripsy; Kidney stone; Semantic segmentation; U-net; DVFNet; Deep learning; DEEP LEARNING FRAMEWORK; MANAGEMENT;
DOI
10.1016/j.compmedimag.2022.102112
Chinese Library Classification (CLC) number
R318 [Biomedical engineering];
Discipline classification code
0831 ;
Abstract
Ureteroscopy with laser lithotripsy has become the most commonly used technique for treating kidney stones. Automated segmentation of kidney stones and the laser fiber is an essential first step towards any automated quantitative analysis, particularly stone-size estimation, which the surgeon can use to decide whether the stone requires further fragmentation. However, factors such as turbid fluid inside the cavity, specularities, motion blur due to kidney movements and camera motion, bleeding, and stone debris degrade visibility within the kidney and lead to extended operative times. To the best of our knowledge, this is the first attempt at multi-class segmentation in ureteroscopy and laser lithotripsy data. We propose an end-to-end convolutional neural network (CNN) based learning framework for the segmentation of stones and the laser fiber. The proposed approach uses two sub-networks: (I) HybResUNet, a hybrid version of the residual U-Net that uses residual connections in the encoder path of the U-Net to improve semantic predictions, and (II) DVFNet, which generates deformation vector field (DVF) predictions by leveraging motion differences between adjacent video frames; these predictions are then used to prune the prediction maps. We also present ablation studies that combine different dilated convolutions, recurrent and residual connections, atrous spatial pyramid pooling, and attention gate models. Further, we propose a compound loss function that significantly boosts segmentation performance on our data. We also provide an ablation study to determine the optimal data augmentation strategy for our dataset. Our qualitative and quantitative results show that the proposed method outperforms state-of-the-art methods such as UNet and DeepLabv3+, with a Dice similarity coefficient (DSC) improvement of 4.15% and 13.34%, respectively, on our in vivo test dataset. We further show that our proposed model outperforms state-of-the-art methods on an unseen out-of-sample clinical dataset, with a DSC improvement of 9.61%, 11%, and 5.24% over UNet, HybResUNet, and DeepLabv3+, respectively, for the stone class, and an improvement of 31.79%, 22.15%, and 10.42% over UNet, HybResUNet, and DeepLabv3+, respectively, for the laser class.
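The results above are reported as per-class DSC values for the stone and laser classes. As a rough illustration of how such a per-class score can be computed for a multi-class label map, here is a minimal pure-Python sketch. The function name `dice_per_class`, the label convention (0 = background, 1 = stone, 2 = laser), and the toy 3x3 masks are assumptions made for illustration only; this record does not describe the authors' actual evaluation code or how scores are averaged across frames.

```python
def dice_per_class(pred, target, num_classes):
    """Return a dict mapping each class label to its Dice score.

    pred and target are 2D lists of integer class labels with the same shape.
    Dice for class c is 2 * |P_c & T_c| / (|P_c| + |T_c|).
    A class absent from both maps is reported as None (score undefined).
    """
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("pred and target must have the same shape")

    scores = {}
    for c in range(num_classes):
        intersection = pred_count = target_count = 0
        for p_row, t_row in zip(pred, target):
            for p, t in zip(p_row, t_row):
                p_hit = p == c
                t_hit = t == c
                pred_count += p_hit
                target_count += t_hit
                intersection += p_hit and t_hit
        denom = pred_count + target_count
        scores[c] = None if denom == 0 else 2.0 * intersection / denom
    return scores


# Hypothetical label convention: 0 = background, 1 = stone, 2 = laser fiber.
target = [
    [0, 1, 1],
    [0, 1, 2],
    [0, 0, 2],
]
pred = [
    [0, 1, 1],
    [0, 0, 2],
    [0, 2, 2],
]

scores = dice_per_class(pred, target, num_classes=3)
for label, name in enumerate(["background", "stone", "laser"]):
    print(name, scores[label])
```

Reporting the score separately for each class, as in the out-of-sample comparison above, keeps a small structure such as the laser fiber from being masked by the much larger background region.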
Pages: 14
Related papers
50 in total
  • [1] MULTI-CLASS SEMANTIC SEGMENTATION OF FACES
    Khan, Khalil
    Mauro, Massimo
    Leonardi, Riccardo
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2015, : 827 - 831
  • [2] Segmentation-based multi-class semantic object detection
    Vieux, Remi
    Benois-Pineau, Jenny
    Domenger, Jean-Philippe
    Braquelaire, Achille
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2012, 60 (02) : 305 - 326
  • [3] Segmentation-based multi-class semantic object detection
    Remi Vieux
    Jenny Benois-Pineau
    Jean-Philippe Domenger
    Achille Braquelaire
    [J]. Multimedia Tools and Applications, 2012, 60 : 305 - 326
  • [4] A Combined Method for Multi-class Image Semantic Segmentation
    Gao, Chao
    Zhang, Xin
    Wang, Hui
    [J]. IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 2012, 58 (02) : 596 - 604
  • [5] Multi-class semantic segmentation of pediatric chest radiographs
    Holste, Gregory
    Sullivan, Ryan P.
    Bindschadler, Michael
    Nagy, Nicholas
    Alessio, Adam
    [J]. MEDICAL IMAGING 2020: IMAGE PROCESSING, 2021, 11313
  • [6] Multi-class Semantic Video Segmentation with Exemplar-based Object Reasoning
    Liu, Buyu
    He, Xuming
    Gould, Stephen
    [J]. 2015 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2015, : 1014 - 1021
  • [7] Multi-Class Lane Semantic Segmentation of Expressway Dataset Based on Aerial View
    Fan, Yongnian
    Wang, Zhiguang
    Chen, Cheng
    Zhang, Xue
    Lu, Qiang
    [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2022, PT III, 2022, 13531 : 200 - 211
  • [8] Multi-class indoor semantic segmentation with deep structured model
    Zheng, Chuanxia
    Wang, Jianhua
    Chen, Weihai
    Wu, Xingming
    [J]. VISUAL COMPUTER, 2018, 34 (05) : 735 - 747
  • [9] Multi-class Token Transformer for Weakly Supervised Semantic Segmentation
    Xu, Lian
    Ouyang, Wanli
    Bennamoun, Mohammed
    Boussaid, Farid
    Xu, Dan
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 4300 - 4309
  • [10] Multi-class semantic segmentation for identification of silicate island defects
    Ramachandran, Vishwath
    Elias, Susan
    Narayanan, Badri
    Thilagam, Ayyappan Uma Chandra
    Sridharann, Niyanth
    [J]. WELDING INTERNATIONAL, 2023, 37 (01) : 12 - 20