Medical image segmentation plays an essential role in developing computer-assisted diagnosis and treatment systems, yet it still faces numerous challenges. In recent years, Convolutional Neural Networks (CNNs) have been successfully applied to medical image segmentation. However, because convolution operations are inherently local, CNN-based architectures are limited in capturing the global context of an image, which can be crucial to segmentation quality. Vision Transformer (ViT) architectures, by contrast, excel at extracting long-range semantic features, but at the cost of high computational complexity. To make medical image segmentation both efficient and accurate, we present a novel lightweight architecture named LeViT-UNet, which integrates multi-stage Transformer blocks into the encoder via LeViT in order to fuse local and global features effectively. Experiments on two challenging segmentation benchmarks show that LeViT-UNet achieves competitive accuracy and efficiency compared with various state-of-the-art methods, suggesting that LeViT can serve as a fast feature encoder for medical image segmentation. For instance, LeViT-UNet-384 achieves Dice similarity coefficients (DSC) of 78.53% and 90.32% at a segmentation speed of 85 frames per second (FPS) on the Synapse and ACDC datasets, respectively. The proposed architecture could therefore be beneficial for prospective clinical trials conducted by radiologists. Our source code is publicly available at https://github.com/apple1986/LeViT_UNet.
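To make the high-level description concrete, the following is a minimal PyTorch sketch of the general design the abstract describes: a convolutional stem whose intermediate feature maps serve as skip connections, a multi-stage Transformer encoder on the coarsest features, and a UNet-style decoder that fuses the global and local features. The stand-in TransformerStage, the stage counts, and the channel widths are illustrative assumptions, not the published LeViT-UNet-384 configuration; refer to the repository above for the actual model.

```python
# Minimal sketch (NOT the authors' implementation) of a hybrid
# Transformer-encoder / UNet-decoder segmentation network. All layer
# sizes below are assumptions chosen for illustration only.
import torch
import torch.nn as nn


class ConvBlock(nn.Module):
    """3x3 conv -> BN -> ReLU, used in the stem and decoder."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)


class TransformerStage(nn.Module):
    """Stand-in for one LeViT attention stage: feature map -> tokens -> map."""
    def __init__(self, dim, heads=4, depth=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=dim * 2,
            batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, x):                      # x: (B, C, H, W)
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)  # (B, H*W, C)
        tokens = self.encoder(tokens)          # global self-attention
        return tokens.transpose(1, 2).reshape(b, c, h, w)


class LeViTUNetSketch(nn.Module):
    def __init__(self, in_ch=1, num_classes=9, dims=(64, 128, 256)):
        super().__init__()
        # Convolutional stem: local features, kept as skip connections.
        self.stem1 = ConvBlock(in_ch, dims[0])
        self.stem2 = ConvBlock(dims[0], dims[1])
        self.stem3 = ConvBlock(dims[1], dims[2])
        self.pool = nn.MaxPool2d(2)
        # Multi-stage Transformer encoder on the coarsest feature map.
        self.stage1 = TransformerStage(dims[2])
        self.stage2 = TransformerStage(dims[2])
        # UNet-style decoder fusing global features with conv skips.
        self.up2 = nn.ConvTranspose2d(dims[2], dims[1], 2, stride=2)
        self.dec2 = ConvBlock(dims[1] * 2, dims[1])
        self.up1 = nn.ConvTranspose2d(dims[1], dims[0], 2, stride=2)
        self.dec1 = ConvBlock(dims[0] * 2, dims[0])
        self.head = nn.Conv2d(dims[0], num_classes, 1)

    def forward(self, x):
        s1 = self.stem1(x)                # local features, full resolution
        s2 = self.stem2(self.pool(s1))    # 1/2 resolution
        s3 = self.stem3(self.pool(s2))    # 1/4 resolution
        g = self.stage2(self.stage1(s3))  # long-range context via attention
        d2 = self.dec2(torch.cat([self.up2(g), s2], dim=1))   # global + local
        d1 = self.dec1(torch.cat([self.up1(d2), s1], dim=1))
        return self.head(d1)              # per-pixel class logits


if __name__ == "__main__":
    model = LeViTUNetSketch()
    logits = model(torch.randn(1, 1, 224, 224))
    print(logits.shape)  # torch.Size([1, 9, 224, 224])
```

The key design point is that self-attention runs only on the downsampled feature map, which keeps its quadratic cost manageable, while the convolutional skip connections preserve the fine local detail that the decoder needs for precise boundaries.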