Multi-dimension unified Swin Transformer for 3D Lesion Segmentation in Multiple Anatomical Locations

Cited: 0
Authors
Pan, Shaoyan [1 ]
Liu, Yiqiao [2 ]
Halek, Sarah [2 ]
Tomaszewski, Michal [2 ]
Wang, Shubing [2 ]
Baumgartner, Richard [2 ]
Yuan, Jianda [2 ]
Goldmacher, Gregory [2 ]
Chen, Antong [2 ]
Affiliations
[1] Emory Univ, Dept Biomed Informat, Atlanta, GA 30322 USA
[2] Merck & Co Inc, Rahway, NJ USA
Source
2023 IEEE 20TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING, ISBI | 2023
Keywords
Lesion segmentation; pre-training; Swin transformer
DOI
10.1109/ISBI53787.2023.10230562
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
In oncology research, accurate 3D segmentation of lesions from CT scans is essential for extracting 3D radiomics features from lesions and for modeling lesion growth kinetics. However, following the RECIST criteria, radiologists routinely delineate each lesion only on the axial slice showing the largest transverse area, and only occasionally delineate a small number of lesions in 3D for research purposes. As a result, for training models to segment lesions automatically, we typically have plenty of unlabeled 3D volumes, an adequate number of labeled 2D images, and scarce labeled 3D volumes, which makes training a 3D segmentation model challenging. In this work, we propose a novel U-shaped deep learning model, denoted the multi-dimension unified Swin transformer (MDU-ST), to conduct 3D lesion segmentation automatically. The MDU-ST consists of a shifted-window transformer (Swin-transformer) encoder and a convolutional neural network (CNN) decoder, allowing the same encoder to accept both 2D and 3D inputs and learn the corresponding semantic information from each. Based on this model, we introduce a three-stage framework to train the model effectively: 1) leveraging a large number of unlabeled 3D lesion volumes through multiple self-supervised pretext tasks so that the Swin-transformer encoder learns the underlying patterns of lesion anatomy; 2) fine-tuning the Swin-transformer encoder on 2D lesion segmentation with 2D RECIST slices to learn slice-level segmentation information; 3) further fine-tuning the Swin-transformer encoder on 3D lesion segmentation with labeled 3D volumes to learn volume-level segmentation information. We compare the proposed MDU-ST with state-of-the-art CNN-based and transformer-based segmentation models on an internal dataset of 593 lesions extracted from multiple anatomical locations and delineated in 3D. The network's performance is evaluated by the Dice similarity coefficient (DSC) for volume-based accuracy and the Hausdorff distance (HD) for surface-based accuracy. The MDU-ST trained with the proposed pipeline achieves an average DSC of 0.78 and an average HD of 5.55 mm, a significant improvement over the competing models. The proposed method can be used for automated 3D lesion segmentation to support large-scale radiomics and tumor-growth modeling studies.
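To make the reported evaluation concrete, the following is a minimal, self-contained sketch (not the authors' code) of the two metrics named in the abstract: the Dice similarity coefficient for volume-based accuracy and the symmetric Hausdorff distance for surface-based accuracy. The function names, the voxel `spacing` parameter (added so HD can be expressed in millimetres, matching the 5.55 mm figure), and the use of all foreground voxels rather than surface voxels only are implementation choices of this sketch, not details from the paper.

    import numpy as np
    from scipy.spatial.distance import directed_hausdorff

    def dice_coefficient(pred, gt):
        """Dice similarity coefficient, 2|P & G| / (|P| + |G|), for binary masks."""
        pred, gt = pred.astype(bool), gt.astype(bool)
        denom = pred.sum() + gt.sum()
        if denom == 0:
            return 1.0  # both masks empty: treat as perfect agreement
        return 2.0 * np.logical_and(pred, gt).sum() / denom

    def hausdorff_distance(pred, gt, spacing=(1.0, 1.0, 1.0)):
        """Symmetric Hausdorff distance between the foreground voxels of two
        binary masks; voxel indices are scaled by spacing to give mm."""
        pred_pts = np.argwhere(pred) * np.asarray(spacing)
        gt_pts = np.argwhere(gt) * np.asarray(spacing)
        return max(directed_hausdorff(pred_pts, gt_pts)[0],
                   directed_hausdorff(gt_pts, pred_pts)[0])

    # Toy check on two overlapping cubes in a 1 mm isotropic grid.
    p = np.zeros((32, 32, 32), dtype=bool); p[8:20, 8:20, 8:20] = True
    g = np.zeros((32, 32, 32), dtype=bool); g[10:22, 10:22, 10:22] = True
    print(dice_coefficient(p, g), hausdorff_distance(p, g))  # ~0.58, ~3.46 mm

Note that published evaluations sometimes restrict HD to surface voxels or report the 95th-percentile HD; the abstract does not specify, so this sketch uses the plain symmetric HD over all foreground voxels.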
Pages: 5
Related papers (50 in total)
  • [11] UGS-M3F: unified gated swin transformer with multi-feature fully fusion for retinal blood vessel segmentation
    Bakkouri, Ibtissam
    Bakkouri, Siham
    BMC MEDICAL IMAGING, 2025, 25 (01)
  • [12] 3D Medical Axial Transformer: A Lightweight Transformer Model for 3D Brain Tumor Segmentation
    Liu, Cheng
    Kiryu, Hisanori
    MEDICAL IMAGING WITH DEEP LEARNING, 2023, 227 : 799 - 813
  • [13] Swin transformer with multiscale 3D atrous convolution for hyperspectral image classification
    Farooque, Ghulam
    Liu, Qichao
    Sargano, Allah Bux
    Xiao, Liang
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 126
  • [14] TransDoubleU-Net: Dual Scale Swin Transformer With Dual Level Decoder for 3D Multimodal Brain Tumor Segmentation
    Vatanpour, Marjan
    Haddadnia, Javad
    IEEE ACCESS, 2023, 11 : 125511 - 125518
  • [15] 3D geometrical segmentation and reconstruction of anatomical structures
    Bueno, G
    Flores, C
    Martinez, A
    Cosias, P
    MEDICAL IMAGING 2005: VISUALIZATION, IMAGE-GUIDED PROCEDURES, AND DISPLAY, PTS 1 AND 2, 2005, 5744 : 43 - 52
  • [16] Stratified Transformer for 3D Point Cloud Segmentation
    Lai, Xin
    Liu, Jianhui
    Jiang, Li
    Wang, Liwei
    Zhao, Hengshuang
    Liu, Shu
    Qi, Xiaojuan
    Jia, Jiaya
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022 : 8490 - 8499
  • [17] Superpoint Transformer for 3D Scene Instance Segmentation
    Sun, Jiahao
    Qing, Chunmei
    Tan, Junpeng
    Xu, Xiangmin
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 2, 2023 : 2393 - 2401
  • [18] Efficient 3D Semantic Segmentation with Superpoint Transformer
    Robert, Damien
    Raguet, Hugo
    Landrieu, Loic
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023 : 17149 - 17158
  • [19] Query Refinement Transformer for 3D Instance Segmentation
    Lu, Jiahao
    Deng, Jiacheng
    Wang, Chuxin
    He, Jianfeng
    Zhang, Tianzhu
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023 : 18470 - 18480
  • [20] Uni3DETR: Unified 3D Detection Transformer
    Wang, Zhenyu
    Li, Yali
    Chen, Xi
    Zhao, Hengshuang
    Wang, Shengjin
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023