Multi-dimension unified Swin Transformer for 3D Lesion Segmentation in Multiple Anatomical Locations

Cited by: 0
Authors
Pan, Shaoyan [1 ]
Liu, Yiqiao [2 ]
Halek, Sarah [2 ]
Tomaszewski, Michal [2 ]
Wang, Shubing [2 ]
Baumgartner, Richard [2 ]
Yuan, Jianda [2 ]
Goldmacher, Gregory [2 ]
Chen, Antong [2 ]
Affiliations
[1] Emory Univ, Dept Biomed Informat, Atlanta, GA 30322 USA
[2] Merck & Co Inc, Rahway, NJ USA
Source
2023 IEEE 20TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING, ISBI | 2023
Keywords
Lesion segmentation; pre-training; Swin transformer;
DOI
10.1109/ISBI53787.2023.10230562
CLC classification
TP18 [Artificial intelligence theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
In oncology research, accurate 3D segmentation of lesions from CT scans is essential for extracting 3D radiomics features from lesions and for modeling lesion growth kinetics. However, following the RECIST criteria, radiologists routinely delineate each lesion only on the axial slice showing the largest transverse area, and only occasionally delineate a small number of lesions in 3D for research purposes. As a result, to train models that segment lesions automatically, we typically have plenty of unlabeled 3D volumes, an adequate number of labeled 2D images, and scarce labeled 3D volumes, which makes training a 3D segmentation model challenging. In this work, we propose a novel U-shaped deep learning model, denoted the multi-dimension unified Swin transformer (MDU-ST), to automatically conduct 3D lesion segmentation. The MDU-ST consists of a shifted-window transformer (Swin-transformer) encoder and a convolutional neural network (CNN) decoder, allowing it to adapt to both 2D and 3D inputs and to learn the corresponding semantic information with the same encoder. Based on this model, we introduce a three-stage framework to train the model effectively: 1) leveraging a large number of unlabeled 3D lesion volumes through multiple self-supervised pretext tasks, so that the Swin-transformer encoder learns the underlying patterns of lesion anatomy; 2) fine-tuning the Swin-transformer encoder on 2D lesion segmentation with 2D RECIST slices to learn slice-level segmentation information; 3) further fine-tuning the Swin-transformer encoder on 3D lesion segmentation with labeled 3D volumes to learn volume-level segmentation information. We compare the proposed MDU-ST with state-of-the-art CNN-based and transformer-based segmentation models on an internal dataset of 593 lesions extracted from multiple anatomical locations and delineated in 3D.
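The dimension-unified idea, one encoder consuming both 2D RECIST slices and 3D volumes, can be illustrated with a minimal NumPy sketch of the patch-embedding step. Everything below (the function name `unified_patch_embed`, the patch size, the embedding dimension, and the fixed random projection standing in for a learned linear layer) is a hypothetical illustration, not the authors' implementation; in the actual model the embedding weights are learned and the deeper Swin layers are shared across dimensionalities.

```python
import numpy as np

def unified_patch_embed(x, patch=4, embed_dim=48, rng=None):
    """Hypothetical sketch of a dimension-unified patch embedding.

    A 2D slice (H, W) is treated as a depth-1 volume so that the same
    tokenizer (and hence the same Swin-style encoder) can consume both
    2D RECIST slices and 3D lesion volumes.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    if x.ndim == 2:                      # 2D slice -> depth-1 volume
        x = x[None, :, :]
    d, h, w = x.shape
    # Non-overlapping patches; the depth patch collapses to 1 for 2D input.
    pd = min(patch, d)
    tokens = x.reshape(d // pd, pd, h // patch, patch, w // patch, patch)
    tokens = tokens.transpose(0, 2, 4, 1, 3, 5).reshape(-1, pd * patch * patch)
    # A fixed random projection stands in for the learned linear embedding.
    proj = rng.standard_normal((tokens.shape[1], embed_dim))
    return tokens @ proj                 # (num_tokens, embed_dim)

# A 32x32 slice and an 8x32x32 volume both yield token sequences of the
# same feature width, ready for a shared transformer encoder.
out2d = unified_patch_embed(np.ones((32, 32)))       # (64, 48)
out3d = unified_patch_embed(np.ones((8, 32, 32)))    # (128, 48)
```

Both inputs end up as token sequences with the same feature width, which is what lets the pretraining (stage 1), 2D fine-tuning (stage 2), and 3D fine-tuning (stage 3) all update the same encoder.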
The networks' performance is evaluated by the Dice similarity coefficient (DSC) for volume-based accuracy and the Hausdorff distance (HD) for surface-based accuracy. The average DSC achieved by the MDU-ST with the proposed pipeline is 0.78, and the average HD is 5.55 mm. The MDU-ST trained with the three-stage framework demonstrates significant improvement over the competing models. The proposed method can be used for automated 3D lesion segmentation to support large-scale radiomics and tumor growth modeling studies.
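For reference, the two reported metrics can be computed from binary masks as in the brute-force sketch below. This is a generic illustration, not the authors' evaluation code; the `spacing` parameter (to convert voxel indices to millimeters) and the handling of empty masks are assumptions.

```python
import numpy as np

def dice_coefficient(pred, gt):
    """Dice similarity coefficient (DSC) between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2.0 * inter / denom if denom else 1.0

def hausdorff_distance(pred, gt, spacing=1.0):
    """Symmetric Hausdorff distance (HD) between two non-empty masks.

    Brute force over all voxel pairs; fine for small lesions, but real
    pipelines use surface-based, spacing-aware implementations.
    """
    a = np.argwhere(pred) * spacing      # foreground coords, scaled to mm
    b = np.argwhere(gt) * spacing
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # max over each mask of the distance to the nearest point in the other
    return max(d.min(axis=1).max(), d.min(axis=0).max())
```

For two identical masks the DSC is 1.0 and the HD is 0.0; a one-voxel diagonal shift of a square mask gives an HD of sqrt(2) voxel units.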
Pages: 5