Automated multi-modal Transformer network (AMTNet) for 3D medical images segmentation

Citations: 8
Authors
Zheng, Shenhai [1 ,2 ]
Tan, Jiaxin [1 ]
Jiang, Chuangbo [3 ]
Li, Laquan [1 ,3 ]
Affiliations
[1] Chongqing Univ Posts & Telecommun, Coll Comp Sci & Technol, Chongqing 400065, Peoples R China
[2] Chongqing Univ Posts & Telecommun, Chongqing Key Lab Image Cognit, Chongqing 400065, Peoples R China
[3] Chongqing Univ Posts & Telecommun, Sch Sci, Chongqing 400065, Peoples R China
Source
PHYSICS IN MEDICINE AND BIOLOGY | 2023, Vol. 68, No. 02
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation;
Keywords
medical image segmentation; Transformer; multi-modal; feature fusion; TUMOR SEGMENTATION;
DOI
10.1088/1361-6560/aca74c
Chinese Library Classification (CLC)
R318 [Biomedical Engineering];
Discipline Code
0831;
Abstract
Objective. In recent years, methods based on convolutional neural networks have dominated the field of medical image segmentation, but these methods have difficulty representing long-range dependencies. Recently, the Transformer has demonstrated superior performance in computer vision and has also been applied successfully to medical image segmentation, owing to its self-attention mechanism and its ability to encode long-range dependencies in images. To the best of our knowledge, only a few works have focused on cross-modality image segmentation with the Transformer. Hence, the main objective of this study was to design, propose, and validate a deep learning method that extends the Transformer to multi-modality medical image segmentation.

Approach. This paper proposes a novel automated multi-modal Transformer network, termed AMTNet, for 3D medical image segmentation. The network is a U-shaped architecture in which substantial changes have been made to the feature encoding, fusion, and decoding parts. The encoding part comprises 3D embedding, 3D multi-modal Transformer, and 3D Co-learn down-sampling blocks; symmetrically, the decoding part includes 3D Transformer, up-sampling, and 3D expanding blocks. In addition, a Transformer-based adaptive channel-interleaved feature fusion module is designed to fully fuse the features of the different modalities.

Main results. We provide a comprehensive experimental analysis on the Prostate and BraTS2021 datasets. Our method achieves an average DSC of 0.907 and 0.851 (0.734 for ET, 0.895 for TC, and 0.924 for WT) on these two datasets, respectively, a significant improvement over state-of-the-art segmentation networks.

Significance. The proposed 3D segmentation network exploits complementary features of the different modalities at multiple scales during feature extraction, enriching the 3D feature representations and improving segmentation efficiency. This work broadens the application of the Transformer to multi-modal medical image segmentation.
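As a reading aid, here is a minimal PyTorch sketch of the two-branch U-shaped design outlined in the abstract: modality-specific 3D encoders, a channel-interleaved fusion step, a Transformer bottleneck, and a convolutional decoder. All class names (TinyAMTNetSketch, ChannelInterleaveFusion), layer choices, and hyper-parameters below are illustrative assumptions, not the authors' AMTNet implementation.

import torch
import torch.nn as nn

class ChannelInterleaveFusion(nn.Module):
    """Interleave the channels of two modality feature maps (a0, b0, a1, b1, ...)
    and mix them with a 1x1x1 convolution. A hypothetical stand-in for the paper's
    adaptive channel-interleaved Transformer fusion module."""
    def __init__(self, channels):
        super().__init__()
        self.mix = nn.Conv3d(2 * channels, channels, kernel_size=1)

    def forward(self, a, b):
        # (B, C, D, H, W) x 2 -> (B, C, 2, D, H, W) -> (B, 2C, D, H, W), channels alternating
        return self.mix(torch.stack((a, b), dim=2).flatten(1, 2))

class TinyAMTNetSketch(nn.Module):
    """Two modality-specific 3D encoders -> fusion -> Transformer bottleneck -> decoder."""
    def __init__(self, in_ch=1, width=32, num_classes=2):
        super().__init__()

        def encoder():  # two stride-2 convs: spatial size divided by 4
            return nn.Sequential(
                nn.Conv3d(in_ch, width, 3, stride=2, padding=1), nn.GELU(),
                nn.Conv3d(width, width, 3, stride=2, padding=1), nn.GELU(),
            )

        self.enc_a, self.enc_b = encoder(), encoder()
        self.fuse = ChannelInterleaveFusion(width)
        layer = nn.TransformerEncoderLayer(d_model=width, nhead=4, batch_first=True)
        self.bottleneck = nn.TransformerEncoder(layer, num_layers=2)
        self.dec = nn.Sequential(  # two stride-2 transposed convs restore full resolution
            nn.ConvTranspose3d(width, width, 2, stride=2), nn.GELU(),
            nn.ConvTranspose3d(width, num_classes, 2, stride=2),
        )

    def forward(self, mod_a, mod_b):
        f = self.fuse(self.enc_a(mod_a), self.enc_b(mod_b))  # (B, C, D, H, W)
        b, c, d, h, w = f.shape
        tokens = f.flatten(2).transpose(1, 2)                # (B, D*H*W, C) token sequence
        f = self.bottleneck(tokens).transpose(1, 2).reshape(b, c, d, h, w)
        return self.dec(f)                                   # voxel-wise class logits

net = TinyAMTNetSketch()
t1 = torch.randn(1, 1, 32, 32, 32)  # e.g. one MR sequence of a volume
t2 = torch.randn(1, 1, 32, 32, 32)  # e.g. a second sequence of the same volume
print(net(t1, t2).shape)            # torch.Size([1, 2, 32, 32, 32])

Interleaving the channels before the 1x1x1 mixing convolution is one simple way to let the fusion step weigh the two modalities jointly; the paper's adaptive Transformer-based fusion is considerably more elaborate.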
Pages: 18
Related Papers
50 records in total
  • [1] 3D Medical Multi-modal Segmentation Network Guided by Multi-source Correlation Constraint
    Zhou, Tongxue
    Canu, Stephane
    Vera, Pierre
    Ruan, Su
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021: 10243-10250
  • [2] OctopusNet: A Deep Learning Segmentation Network for Multi-modal Medical Images
    Chen, Yu
    Chen, Jiawei
    Wei, Dong
    Li, Yuexiang
    Zheng, Yefeng
    MULTISCALE MULTIMODAL MEDICAL IMAGING, MMMI 2019, 2020, 11977: 17-25
  • [3] UNIVERSAL MULTI-MODAL DEEP NETWORK FOR CLASSIFICATION AND SEGMENTATION OF MEDICAL IMAGES
    Harouni, Ahmed
    Karargyris, Alexandros
    Negahdar, Mohammadreza
    Beymer, David
    Syeda-Mahmood, Tanveer
    2018 IEEE 15TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (ISBI 2018), 2018: 872-876
  • [4] A framework for unsupervised segmentation of multi-modal medical images
    El-Baz, Ayman
    Farag, Aly
    Ali, Asem
    Gimel'farb, Georgy
    Casanova, Manuel
    COMPUTER VISION APPROACHES TO MEDICAL IMAGE ANALYSIS, 2006, 4241: 120-131
  • [5] 3D deeply supervised network for automated segmentation of volumetric medical images
    Dou, Qi
    Yu, Lequan
    Chen, Hao
    Jin, Yueming
    Yang, Xin
    Qin, Jing
    Heng, Pheng-Ann
    MEDICAL IMAGE ANALYSIS, 2017, 41: 40-54
  • [6] SEGMENTATION OF INFLAMED SYNOVIA IN MULTI-MODAL 3D MRI
    Basso, Curzio
    Santoro, Matteo
    Verri, Alessandro
    Esposito, Mario
    2009 IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING: FROM NANO TO MACRO, VOLS 1 AND 2, 2009: 229+
  • [7] AAFormer: A Multi-Modal Transformer Network for Aerial Agricultural Images
    Shen, Yao
    Wang, Lei
    Jin, Yue
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022: 1704-1710
  • [8] Joint segmentation of tumors in 3D PET-CT images with a network fusing multi-view and multi-modal information
    Zheng, Haoyang
    Zou, Wei
    Hu, Nan
    Wang, Jiajun
    PHYSICS IN MEDICINE AND BIOLOGY, 2024, 69 (20)
  • [9] Multi-modal Transformer for Brain Tumor Segmentation
    Cho, Jihoon
    Park, Jinah
    BRAINLESION: GLIOMA, MULTIPLE SCLEROSIS, STROKE AND TRAUMATIC BRAIN INJURIES, BRAINLES 2022, 2023, 13769: 138-148
  • [10] Dual-attention transformer-based hybrid network for multi-modal medical image segmentation
    Zhang, Menghui
    Zhang, Yuchen
    Liu, Shuaibing
    Han, Yahui
    Cao, Honggang
    Qiao, Bingbing
    SCIENTIFIC REPORTS, 2024, 14 (01)