LeViT-UNet: Make Faster Encoders with Transformer for Medical Image Segmentation

Cited by: 80
Authors
Xu, Guoping [1]
Zhang, Xuan [1]
He, Xinwei [2]
Wu, Xinglong [1]
Affiliations
[1] Wuhan Inst Technol, Sch Comp Sci & Engn, Hubei Key Lab Intelligent Robot, Wuhan 430205, Hubei, Peoples R China
[2] Huazhong Agr Univ, Coll Informat, Wuhan 430070, Hubei, Peoples R China
Keywords
Medical Image Segmentation; Transformer; Convolutional Neural Network
DOI
10.1007/978-981-99-8543-2_4
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Medical image segmentation plays an essential role in developing computer-assisted diagnosis and treatment systems, yet it still faces numerous challenges. In the past few years, Convolutional Neural Networks (CNNs) have been successfully applied to medical image segmentation. However, because convolution operations are local, CNN-based architectures are limited in learning global context information in images, which may be crucial to successful medical image segmentation. Vision Transformer (ViT) architectures, by contrast, have a remarkable ability to extract long-range semantic features, but at the cost of high computational complexity. To make medical image segmentation more efficient and accurate, we present a novel lightweight architecture named LeViT-UNet, which integrates multi-stage Transformer blocks into the encoder via LeViT, aiming to explore the effectiveness of fusing local and global features. Our experiments on two challenging segmentation benchmarks indicate that the proposed LeViT-UNet achieves competitive performance compared with various state-of-the-art methods in terms of both efficiency and accuracy, suggesting that LeViT can serve as a faster feature encoder for medical image segmentation. LeViT-UNet-384, for instance, achieves Dice similarity coefficients (DSC) of 78.53% and 90.32% on the Synapse and ACDC datasets, respectively, with a segmentation speed of 85 frames per second (FPS). The proposed architecture could therefore be beneficial for prospective clinical trials conducted by radiologists. Our source code is publicly available at https://github.com/apple1986/LeViT_UNet.
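The accuracy figures in the abstract are Dice similarity coefficients. As a reminder of what that metric measures, here is a minimal sketch using the standard definition DSC = 2|A ∩ B| / (|A| + |B|) for two binary masks. The function name, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; none of this is taken from the authors' repository.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient of two binary masks (flat sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of pixels")
    # Pixels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(pred) + sum(target)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


# Toy example: 3 predicted pixels, 3 ground-truth pixels, 2 shared.
pred_mask = [1, 1, 0, 0, 1, 0]
true_mask = [1, 0, 0, 0, 1, 1]
print(f"DSC = {dice_coefficient(pred_mask, true_mask):.4f}")
```

For multi-organ benchmarks, a per-class DSC is typically computed on each class's binary mask and then averaged; whether the reported 78.53% and 90.32% were aggregated exactly this way is not stated in the abstract.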
Pages: 42-53
Number of pages: 12