Enhanced transformer encoder and hybrid cascaded upsampler for medical image segmentation

Authors
Li, Chaoqun [1]
Wang, Liejun [1]
Cheng, Shuli [1]
Affiliations
[1] Xinjiang Univ, Coll Informat Sci & Engn, Urumqi 830046, Xinjiang, Peoples R China
Keywords
Medical image segmentation; Convolution neural network; Enhanced transformer; Hybrid cascaded upsampler; NETWORK; NET;
DOI
10.1016/j.eswa.2023.121965
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
UNet has been highly successful in various medical image segmentation tasks, but the limited receptive field of convolutional operations prevents it from explicitly modeling global context information. Vision Transformer captures global dependencies through self-attention (SA), thus alleviating the receptive-field locality problem of convolutional neural network (CNN) architectures. However, the traditional Transformer typically relies on SA with high computational complexity, and its fusion mechanism is a static MLP, which is not efficient enough. In addition, current segmentation methods usually perform only simple feature fusion on the decoder side of the U-shaped architecture, which cannot meet the demand for important features when generating prediction maps. To solve these problems, we propose the E-TUNet network. On the one hand, we design the Enhanced Transformer as the encoder by introducing EMSA and DynaMixer MLP. The Enhanced Transformer has high computational efficiency and dynamic mixing weights, which alleviates the problem of a single static fusion mechanism. On the other hand, we introduce a G-L MLP block with global-local spatial interaction capability to form a hybrid cascaded upsampler that computes and matches the importance of decoder-side features. The hybrid cascaded upsampler has stronger information representation capability and effectively combines CNN and MLP to capture local and global dependencies. We demonstrate the effectiveness of E-TUNet on two publicly available datasets, Synapse and ACDC, where extensive experiments show that it is highly competitive with other methods. The mean DSC (%) is 82.15 on Synapse and 91.12 on ACDC, and HD95 (mm) is 17.89 on the Synapse dataset. E-TUNet achieves a significant performance improvement in multi-organ segmentation tasks, reaching an advanced level.
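The abstract reports results as mean DSC (Dice similarity coefficient). As a minimal sketch of how such a score is commonly computed, assuming the standard definition DSC = 2|A ∩ B| / (|A| + |B|) per class, averaged over the foreground classes: the function names `dice_coefficient` and `mean_dice`, the flattened label-map input, and the empty-class convention below are illustrative assumptions, not the authors' evaluation code.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int], label: int) -> float:
    """Dice similarity coefficient for one class label.

    `pred` and `target` are flattened label maps of equal length
    (hypothetical input format chosen for this sketch).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    pred_count = sum(1 for p in pred if p == label)
    target_count = sum(1 for t in target if t == label)
    overlap = sum(1 for p, t in zip(pred, target) if p == label and t == label)
    denom = pred_count + target_count
    if denom == 0:
        # Assumption: a class absent from both maps counts as perfect agreement.
        return 1.0
    return 2.0 * overlap / denom


def mean_dice(pred: Sequence[int], target: Sequence[int], labels: Sequence[int]) -> float:
    """Average the per-class DSC over the given class labels."""
    if not labels:
        raise ValueError("labels must not be empty")
    return sum(dice_coefficient(pred, target, c) for c in labels) / len(labels)


# Toy example: label 0 is background, labels 1 and 2 are two organs.
pred = [0, 1, 1, 2, 2, 0]
target = [0, 1, 2, 2, 2, 0]
print(f"mean DSC (%): {100.0 * mean_dice(pred, target, [1, 2]):.2f}")
```

In the toy example, class 1 overlaps on one voxel out of 2 predicted and 1 true (DSC 2/3), and class 2 overlaps on two voxels out of 2 predicted and 3 true (DSC 0.8), so the mean is about 73.33%. Reported figures such as 82.15 on Synapse are typically averaged over test cases as well as classes.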
Pages: 13
Related papers
50 in total
  • [21] Medical Image Segmentation Using Transformer Networks
    Karimi, Davood
    Dou, Haoran
    Gholipour, Ali
    IEEE ACCESS, 2022, 10 : 29322 - 29332
  • [22] ATFormer: Advanced transformer for medical image segmentation
    Chen, Yong
    Lu, Xuesong
    Xie, Oinlan
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2023, 85
  • [23] The Fully Convolutional Transformer for Medical Image Segmentation
    Tragakis, Athanasios
    Kaul, Chaitanya
    Murray-Smith, Roderick
    Husmeier, Dirk
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 3649 - 3658
  • [24] Automatic Medical Image Segmentation with Vision Transformer
    Zhang, Jie
    Li, Fan
    Zhang, Xin
    Wang, Huaijun
    Hei, Xinhong
    APPLIED SCIENCES-BASEL, 2024, 14 (07):
  • [25] Medical Image Segmentation via Cascaded Attention Decoding
    Rahman, Md Mostafijur
    Marculescu, Radu
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 6211 - 6220
  • [26] H2Former: An Efficient Hierarchical Hybrid Transformer for Medical Image Segmentation
    He, Along
    Wang, Kai
    Li, Tao
    Du, Chengkun
    Xia, Shuang
    Fu, Huazhu
    IEEE TRANSACTIONS ON MEDICAL IMAGING, 2023, 42 (09) : 2763 - 2775
  • [27] FDR-TransUNet: A novel encoder-decoder architecture with vision transformer for improved medical image segmentation
    Zhang, Chaoyang
    Sun, Shibao
    Hu, Wenmao
    Zhao, Pengcheng
    COMPUTERS IN BIOLOGY AND MEDICINE, 2024, 169
  • [28] HTC-Net: A hybrid CNN-transformer framework for medical image segmentation
    Tang, Hui
    Chen, Yuanbin
    Wang, Tao
    Zhou, Yuanbo
    Zhao, Longxuan
    Gao, Qinquan
    Du, Min
    Tan, Tao
    Zhang, Xinlin
    Tong, Tong
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 88
  • [29] Multiscale TransUNet++: dense hybrid U-Net with transformer for medical image segmentation
    Wang, Bo
    Wang, Fan
    Dong, Pengwei
    Li, Chongyi
    SIGNAL, IMAGE AND VIDEO PROCESSING, 2022, 16 : 1607 - 1614
  • [30] Cross Attention Multi Scale CNN-Transformer Hybrid Encoder Is General Medical Image Learner
    Zhou, Rongzhou
    Yao, Junfeng
    Hong, Qingqi
    Li, Xingxin
    Cao, Xianpeng
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT XIII, 2024, 14437 : 85 - 97