Enhancing the reliability of deep learning-based head and neck tumour segmentation using uncertainty estimation with multi-modal images

Cited by: 1
Authors
Ren, Jintao [1, 2, 3]
Teuwen, Jonas [4]
Nijkamp, Jasper [1, 3]
Rasmussen, Mathis [1, 2, 3]
Gouw, Zeno [4]
Eriksen, Jesper Grau [2, 3]
Sonke, Jan-Jakob [4]
Korreman, Stine [1, 2, 3]
Affiliations
[1] Aarhus Univ Hosp, Danish Ctr Particle Therapy, Palle Juul Jensens Blvd 25, DK-8200 Aarhus N, Denmark
[2] Aarhus Univ Hosp, Dept Oncol, Palle Juul Jensens Blvd 25, DK-8200 Aarhus N, Denmark
[3] Aarhus Univ, Dept Clin Med, Palle Juul Jensens Blvd 25, DK-8200 Aarhus N, Denmark
[4] Netherlands Canc Inst, Dept Radiat Oncol, Plesmanlaan 121, NL-1066 CX Amsterdam, Netherlands
Source
PHYSICS IN MEDICINE AND BIOLOGY | 2024 / Vol. 69 / Issue 16
Keywords
uncertainty estimation; deep learning; radiotherapy; gross tumour volume; head and neck cancer; tumour segmentation; uncertainty quantification; QUANTIFICATION; OROPHARYNGEAL; DELINEATION; DAHANCA;
DOI
10.1088/1361-6560/ad682d
Chinese Library Classification number
R318 [Biomedical Engineering];
Discipline classification code
0831;
Abstract
Objective. Deep learning shows promise in autosegmentation of head and neck cancer (HNC) primary tumours (GTV-T) and nodal metastases (GTV-N). However, errors such as including non-tumour regions or missing nodal metastases still occur. Conventional methods often make overconfident predictions, compromising reliability. Incorporating uncertainty estimation, which provides calibrated confidence intervals, can address this issue. Our aim was to investigate the efficacy of various uncertainty estimation methods in improving segmentation reliability. We evaluated their confidence levels in voxel predictions and their ability to reveal potential segmentation errors.
Approach. We retrospectively collected data from 567 HNC patients with diverse cancer sites and multi-modality images (CT, PET, T1-, and T2-weighted MRI) along with their clinical GTV-T/N delineations. Using the nnUNet 3D segmentation pipeline, we compared seven uncertainty estimation methods, evaluating them on segmentation accuracy (Dice similarity coefficient, DSC), confidence calibration (Expected Calibration Error, ECE), and their ability to reveal segmentation errors (Uncertainty-Error overlap using DSC, UE-DSC).
Main results. Evaluated on the hold-out test dataset (n = 97), the median DSC scores for GTV-T and GTV-N segmentation across all uncertainty estimation methods had a narrow range, from 0.73 to 0.76 and 0.78 to 0.80, respectively. In contrast, the median ECE exhibited a wider range, from 0.30 to 0.12 for GTV-T and 0.25 to 0.09 for GTV-N. Similarly, the median UE-DSC also ranged broadly, from 0.21 to 0.38 for GTV-T and 0.22 to 0.36 for GTV-N. A probabilistic network method, PhiSeg, consistently demonstrated the best performance in terms of ECE and UE-DSC.
Significance. Our study highlights the importance of uncertainty estimation in enhancing the reliability of deep learning for autosegmentation of HNC GTV. The results show that while segmentation accuracy can be similar across methods, their reliability, measured by calibration error and uncertainty-error overlap, varies significantly. Used with visualisation maps, these methods may effectively pinpoint uncertainties and potential errors at the voxel level.
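The abstract names three evaluation metrics (DSC, ECE, UE-DSC) without giving formulas. The sketch below shows one common way such metrics are computed on voxel arrays. It is an illustration under stated assumptions, not the authors' implementation: the function names, the 10 equal-width confidence bins, the 0.5 foreground cut-off, and the uncertainty `threshold` argument are all hypothetical choices made here.

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    denom = a.sum() + b.sum()
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if denom == 0 else float(2.0 * np.logical_and(a, b).sum() / denom)

def expected_calibration_error(prob, truth, n_bins=10):
    """Binned ECE for binary voxel predictions.

    prob  : predicted foreground probability per voxel
    truth : ground-truth binary labels per voxel
    """
    prob = np.asarray(prob, dtype=float).ravel()
    truth = np.asarray(truth, dtype=bool).ravel()
    pred = prob >= 0.5                        # assumed foreground cut-off
    conf = np.where(pred, prob, 1.0 - prob)   # confidence in the predicted class
    correct = pred == truth
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_idx = np.clip(np.digitize(conf, edges[1:-1]), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        in_bin = bin_idx == b
        if in_bin.any():
            # Gap between accuracy and mean confidence, weighted by bin size.
            gap = abs(correct[in_bin].mean() - conf[in_bin].mean())
            ece += in_bin.mean() * gap
    return float(ece)

def ue_dsc(uncertainty, pred, truth, threshold):
    """Dice overlap between thresholded uncertainty and the error map."""
    error = np.asarray(pred, dtype=bool) != np.asarray(truth, dtype=bool)
    uncertain = np.asarray(uncertainty, dtype=float) >= threshold
    return dice(uncertain, error)
```

Under these definitions a lower ECE means better calibration, while a higher UE-DSC means the uncertain voxels coincide more closely with the voxels that were actually mis-segmented.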
Pages: 14