LeViT-UNet: Make Faster Encoders with Transformer for Medical Image Segmentation

Cited by: 80

Authors:

Xu, Guoping [1]

Zhang, Xuan [1]

He, Xinwei [2]

Wu, Xinglong [1]

Affiliations:

[1] Wuhan Inst Technol, Sch Comp Sci & Engn, Hubei Key Lab Intelligent Robot, Wuhan 430205, Hubei, Peoples R China

[2] Huazhong Agr Univ, Coll Informat, Wuhan 430070, Hubei, Peoples R China

Source:

PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VIII | 2024 / Vol. 14432

Keywords:

Medical Image Segmentation; Transformer; Convolutional Neural Network;

DOI:

10.1007/978-981-99-8543-2_4

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Medical image segmentation plays an essential role in developing computer-assisted diagnosis and treatment systems, yet it still faces numerous challenges. In the past few years, Convolutional Neural Networks (CNNs) have been successfully applied to medical image segmentation. However, because convolution operations are local, CNN-based architectures are limited in learning global context information in images, which might be crucial to the success of medical image segmentation. Meanwhile, vision Transformer (ViT) architectures have a remarkable ability to extract long-range semantic features, at the cost of high computational complexity. To make medical image segmentation more efficient and accurate, we present a novel lightweight architecture named LeViT-UNet, which integrates multi-stage Transformer blocks into the encoder via LeViT, aiming to explore the effectiveness of fusing local and global features. Our experiments on two challenging segmentation benchmarks indicate that the proposed LeViT-UNet achieves competitive performance compared with various state-of-the-art methods in terms of efficiency and accuracy, suggesting that LeViT can be a faster feature encoder for medical image segmentation. LeViT-UNet-384, for instance, achieves a Dice similarity coefficient (DSC) of 78.53% and 90.32% on the Synapse and ACDC datasets, respectively, with a segmentation speed of 85 frames per second (FPS). Therefore, the proposed architecture could be beneficial for prospective clinical trials conducted by radiologists. Our source code is publicly available at https://github.com/apple1986/LeViT_UNet.
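The Dice similarity coefficient (DSC) reported in the abstract is commonly defined as twice the overlap between the predicted and reference masks divided by the sum of their sizes. The following is a minimal sketch of that standard definition for a single binary mask. The function name `dice_coefficient`, the flat 0/1 list representation, and the convention that two empty masks score 1.0 are assumptions made for illustration; this record does not show the authors' evaluation code, and multi-organ benchmarks typically average the score over classes.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks.

    Both masks are flat sequences of 0/1 values of equal length
    (a hypothetical representation chosen for this sketch).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")

    # Number of positions marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Sum of the foreground sizes of the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1, 0]
    reference = [1, 0, 0, 0, 1, 1]
    print(f"DSC = {dice_coefficient(predicted, reference):.4f}")
```

A score of 1.0 means the two masks coincide exactly, and 0.0 means they share no foreground element; the percentages in the abstract are this ratio multiplied by 100.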

Citation

Pages: 42-53

Page count: 12

50 items in total

[21] Combining Swin Transformer With UNet for Remote Sensing Image Semantic Segmentation
Fan, Lili
Zhou, Yu
Liu, Hongmei
Li, Yunjie
Cao, Dongpu
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61 : 1 - 11
[22] Swin Transformer Embedding UNet for Remote Sensing Image Semantic Segmentation
He, Xin
Zhou, Yong
Zhao, Jiaqi
Zhang, Di
Yao, Rui
Xue, Yong
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
[23] Token Sparsification for Faster Medical Image Segmentation
Zhou, Lei
Liu, Huidong
Bae, Joseph
He, Junjun
Samaras, Dimitris
Prasanna, Prateek
INFORMATION PROCESSING IN MEDICAL IMAGING, IPMI 2023, 2023, 13939 : 743 - 754
[24] CoT-UNet++: A medical image segmentation method based on contextual transformer and dense connection
Yin, Yijun
Xu, Wenzheng
Chen, Lei
Wu, Hao
MATHEMATICAL BIOSCIENCES AND ENGINEERING, 2023, 20 (05) : 8320 - 8336
[25] STU3: Multi-organ CT Medical Image Segmentation Model Based on Transformer and UNet
Zheng, Wenjin
Li, Bo
Chen, Wanyi
ARTIFICIAL INTELLIGENCE, CICAI 2023, PT I, 2024, 14473 : 170 - 181
[26] UNET 3+: A FULL-SCALE CONNECTED UNET FOR MEDICAL IMAGE SEGMENTATION
Huang, Huimin
Lin, Lanfen
Tong, Ruofeng
Hu, Hongjie
Zhang, Qiaowei
Iwamoto, Yutaro
Han, Xianhua
Chen, Yen-Wei
Wu, Jian
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 1055 - 1059
[27] Light-UNet: An Efficient Segmentation Network for Medical Image
Zhang, Yue
Xu, Chao
Zhang, Zhifan
Wang, Jianjun
ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT VI, ICIC 2024, 2024, 14867 : 302 - 313
[28] Semantic Segmentation in Medical Image Based on Hybrid Dlinknet and Unet
Samudrala, Suresh
Mohan, C. Krishna
3rd IEEE 2022 International Conference on Computing, Communication, and Intelligent Systems, ICCCIS 2022, 2022, : 42 - 47
[29] Vision Mamba and xLSTM-UNet for medical image segmentation
Xin Zhong
Gehao Lu
Hao Li
Scientific Reports, 15 (1)
[30] Hybrid Shunted Transformer embedding UNet for remote sensing image semantic segmentation
Zhou H.
Xiao X.
Li H.
Liu X.
Liang P.
Neural Computing and Applications, 2024, 36 (25) : 15705 - 15720
