TransUNet: Rethinking the U-Net architecture design for medical image segmentation through the lens of transformers

Cited by: 2

Authors:

Chen, Jieneng [1]

Mei, Jieru [1]

Li, Xianhang [2]

Lu, Yongyi [1]

Yu, Qihang [1,3]

Wei, Qingyue

Luo, Xiangde [4]

Xie, Yutong [8]

Adeli, Ehsan [5]

Wang, Yan [7]

Lungren, Matthew P. [5]

Zhang, Shaoting [4]

Xing, Lei [3]

Lu, Le [6]

Yuille, Alan [1]

Zhou, Yuyin [2]

Affiliations:

[1] Johns Hopkins Univ, Dept Comp Sci, Baltimore, MD 21218 USA

[2] Univ Calif Santa Cruz, Dept Comp Sci & Engn, Santa Cruz, CA 95064 USA

[3] Stanford Univ, Dept Radiat Oncol, Stanford, CA 94305 USA

[4] Shanghai AI Lab, Shanghai 200000, Peoples R China

[5] Stanford Univ, Sch Med, Stanford, CA 94305 USA

[6] Alibaba Grp, DAMO Acad, New York, NY 10014 USA

[7] East China Normal Univ, Shanghai 200062, Peoples R China

[8] Univ Adelaide, Australian Inst Machine Learning, Adelaide, Australia

Source:

MEDICAL IMAGE ANALYSIS | 2024 / Vol. 97

Keywords:

Medical image segmentation; Vision Transformers; U-Net;

DOI:

10.1016/j.media.2024.103280

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Medical image segmentation is crucial for healthcare, yet convolution-based methods like U-Net face limitations in modeling long-range dependencies. To address this, Transformers designed for sequence-to-sequence predictions have been integrated into medical image segmentation. However, a comprehensive understanding of Transformers' self-attention in U-Net components is lacking. TransUNet, first introduced in 2021, is widely recognized as one of the first models to integrate Transformers into medical image analysis. In this study, we present the versatile framework of TransUNet that encapsulates Transformers' self-attention into two key modules: (1) a Transformer encoder tokenizing image patches from a convolutional neural network (CNN) feature map, facilitating global context extraction, and (2) a Transformer decoder refining candidate regions through cross-attention between proposals and U-Net features. These modules can be flexibly inserted into the U-Net backbone, resulting in three configurations: Encoder-only, Decoder-only, and Encoder+Decoder. TransUNet provides a library encompassing both 2D and 3D implementations, enabling users to easily tailor the chosen architecture. Our findings highlight the encoder's efficacy in modeling interactions among multiple abdominal organs and the decoder's strength in handling small targets like tumors. TransUNet excels in diverse medical applications, such as multi-organ segmentation, pancreatic tumor segmentation, and hepatic vessel segmentation. Notably, our TransUNet achieves a significant average Dice improvement of 1.06% and 4.30% for multi-organ segmentation and pancreatic tumor segmentation, respectively, when compared to the highly competitive nnU-Net, and surpasses the top-1 solution in the BraTS 2021 challenge. The 2D and 3D code and models are available at https://github.com/Beckschen/TransUNet and https://github.com/Beckschen/TransUNet-3D, respectively.
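The two modules named in the abstract can be illustrated with a toy sketch: flattening a CNN feature map into a token sequence, then letting a query vector attend over those tokens with scaled dot-product attention. This is not the authors' implementation. The function names (`tokenize`, `cross_attention`) and the tiny 2 x 2 feature map are hypothetical, and learned projections, multiple heads, and positional embeddings are all omitted for brevity.

```python
import math


def tokenize(feature_map):
    """Flatten an H x W grid of C-dim feature vectors into H*W tokens (row-major)."""
    return [list(vec) for row in feature_map for vec in row]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, tokens):
    """Scaled dot-product attention where each query attends over all tokens.

    Assumption: keys and values are the raw tokens (no learned projections).
    Returns the refined query vectors and the attention weights per query.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    refined, weights = [], []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in tokens]
        w = softmax(scores)
        # Weighted sum of token vectors, one output channel at a time.
        out = [sum(wj * tok[d] for wj, tok in zip(w, tokens)) for d in range(dim)]
        refined.append(out)
        weights.append(w)
    return refined, weights


# Hypothetical 2 x 2 feature map with 2 channels per location.
feature_map = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0], [0.0, 0.0]],
]
tokens = tokenize(feature_map)

# One hypothetical "proposal" query that responds to the first channel.
queries = [[4.0, 0.0]]
refined, weights = cross_attention(queries, tokens)
```

In this toy setup the query places most of its attention on the two locations whose first channel is active, which is the sense in which a query can pool evidence from anywhere in the feature map rather than from a local neighborhood.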


Pages: 10

