Enhanced transformer encoder and hybrid cascaded upsampler for medical image segmentation

Authors
Li, Chaoqun [1]
Wang, Liejun [1]
Cheng, Shuli [1]
Affiliation
[1] Xinjiang Univ, Coll Informat Sci & Engn, Urumqi 830046, Xinjiang, Peoples R China
Keywords
Medical image segmentation; Convolution neural network; Enhanced transformer; Hybrid cascaded upsampler; NETWORK; NET;
DOI
10.1016/j.eswa.2023.121965
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
UNet has been highly successful in various medical image segmentation tasks, but the restricted receptive field of convolutional operations leaves UNet unable to explicitly model global context information. Vision Transformer captures global dependencies through self-attention (SA), thus alleviating the receptive-field locality problem of convolutional neural network (CNN) architectures. However, the traditional Transformer typically relies on SA with high computational complexity, and its fusion mechanism is a static MLP, which is not efficient enough. In addition, current segmentation methods usually perform only simple feature fusion on the decoder side of the U-shaped architecture, which cannot meet the demand for important features when generating prediction maps. To solve these problems, we propose the E-TUNet network. On the one hand, we design the Enhanced Transformer as the encoder by introducing EMSA and DynaMixer MLP. The Enhanced Transformer has high computational efficiency and dynamic mixing weights, which alleviates the problem of a single static fusion mechanism. On the other hand, we introduce a G-L MLP block with global-local spatial interaction capability to form a hybrid cascaded upsampler that computes and matches the importance of decoder-side features. The hybrid cascaded upsampler has stronger information representation capability and effectively combines CNN and MLP to capture local and global dependencies. We demonstrate the effectiveness of E-TUNet on two publicly available datasets. Extensive experiments show that our method is highly competitive with other methods. In particular, on the Synapse and ACDC datasets, the mean DSC (%) is 82.15 and 91.12, respectively, and HD95 (mm) is 17.89 on the Synapse dataset. E-TUNet achieves a significant performance improvement in multi-organ segmentation tasks, reaching an advanced level.
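For readers unfamiliar with the reported metric, DSC is the Dice similarity coefficient, 2|P ∩ T| / (|P| + |T|) for a predicted region P and a ground-truth region T. The following is a minimal sketch of a per-class and mean DSC computation. It is not the authors' evaluation code: the function names, the flat label sequences, and the convention of scoring 1.0 when a class is absent from both masks are assumptions made for illustration.

```python
def dice_coefficient(pred, target, label):
    """Dice similarity coefficient for one class over two flat label sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    pred_count = sum(1 for p in pred if p == label)
    target_count = sum(1 for t in target if t == label)
    overlap = sum(1 for p, t in zip(pred, target) if p == label and t == label)
    denom = pred_count + target_count
    if denom == 0:
        # Assumption: a class missing from both masks counts as perfect agreement.
        return 1.0
    return 2.0 * overlap / denom


def mean_dice_percent(pred, target, labels):
    """Mean DSC over the given class labels, expressed as a percentage."""
    scores = [dice_coefficient(pred, target, c) for c in labels]
    return 100.0 * sum(scores) / len(scores)


# Toy example: 8 pixels, background = 0, two hypothetical organ classes 1 and 2.
pred = [0, 1, 1, 2, 2, 2, 0, 1]
target = [0, 1, 1, 1, 2, 2, 0, 0]
print(round(mean_dice_percent(pred, target, labels=[1, 2]), 2))  # 73.33
```

In this toy case class 1 scores 2/3 and class 2 scores 0.8, so the mean is about 73.33%. The background label is left out of the mean here, which is a common convention but an assumption with respect to this paper.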
Pages: 13