TransUNet: Rethinking the U-Net architecture design for medical image segmentation through the lens of transformers

Cited by: 2
Authors
Chen, Jieneng [1 ]
Mei, Jieru [1 ]
Li, Xianhang [2 ]
Lu, Yongyi [1 ]
Yu, Qihang [1 ,3 ]
Wei, Qingyue
Luo, Xiangde [4 ]
Xie, Yutong [8 ]
Adeli, Ehsan [5 ]
Wang, Yan [7 ]
Lungren, Matthew P. [5 ]
Zhang, Shaoting [4 ]
Xing, Lei [3 ]
Lu, Le [6 ]
Yuille, Alan [1 ]
Zhou, Yuyin [2 ]
Affiliations
[1] Johns Hopkins Univ, Dept Comp Sci, Baltimore, MD 21218 USA
[2] Univ Calif Santa Cruz, Dept Comp Sci & Engn, Santa Cruz, CA 95064 USA
[3] Stanford Univ, Dept Radiat Oncol, Stanford, CA 94305 USA
[4] Shanghai AI Lab, Shanghai 200000, Peoples R China
[5] Stanford Univ, Sch Med, Stanford, CA 94305 USA
[6] Alibaba Grp, DAMO Acad, New York, NY 10014 USA
[7] East China Normal Univ, Shanghai 200062, Peoples R China
[8] Univ Adelaide, Australian Inst Machine Learning, Adelaide, Australia
Keywords
Medical image segmentation; Vision Transformers; U-Net;
DOI
10.1016/j.media.2024.103280
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Medical image segmentation is crucial for healthcare, yet convolution-based methods like U-Net face limitations in modeling long-range dependencies. To address this, Transformers designed for sequence-to-sequence predictions have been integrated into medical image segmentation. However, a comprehensive understanding of Transformers' self-attention in U-Net components is lacking. TransUNet, first introduced in 2021, is widely recognized as one of the first models to integrate Transformers into medical image analysis. In this study, we present the versatile framework of TransUNet that encapsulates Transformers' self-attention into two key modules: (1) a Transformer encoder tokenizing image patches from a convolutional neural network (CNN) feature map, facilitating global context extraction, and (2) a Transformer decoder refining candidate regions through cross-attention between proposals and U-Net features. These modules can be flexibly inserted into the U-Net backbone, resulting in three configurations: Encoder-only, Decoder-only, and Encoder+Decoder. TransUNet provides a library encompassing both 2D and 3D implementations, enabling users to easily tailor the chosen architecture. Our findings highlight the encoder's efficacy in modeling interactions among multiple abdominal organs and the decoder's strength in handling small targets like tumors. TransUNet excels in diverse medical applications, such as multi-organ segmentation, pancreatic tumor segmentation, and hepatic vessel segmentation. Notably, our TransUNet achieves a significant average Dice improvement of 1.06% and 4.30% for multi-organ segmentation and pancreatic tumor segmentation, respectively, when compared to the highly competitive nnU-Net, and surpasses the top-1 solution in the BraTS2021 challenge. 2D and 3D code and models are available at https://github.com/Beckschen/TransUNet and https://github.com/Beckschen/TransUNet-3D, respectively.
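The two attention modules named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repositories linked above for that). It assumes a single attention head and omits learned projections, positional embeddings, normalization, and feed-forward layers. The tensor sizes, the number of proposals, and the helper names `tokenize` and `attention` are hypothetical, chosen only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = queries.shape[-1]
    weights = softmax(queries @ keys.T / np.sqrt(d), axis=-1)
    return weights @ values, weights


def tokenize(feature_map):
    """Flatten a (C, H, W) feature map into H*W tokens of dimension C."""
    c, h, w = feature_map.shape
    return feature_map.reshape(c, h * w).T


rng = np.random.default_rng(0)

# Hypothetical CNN feature map: 16 channels on an 8 x 8 grid.
feature_map = rng.standard_normal((16, 8, 8))
tokens = tokenize(feature_map)  # shape (64, 16)

# Encoder-style step: every token attends to every other token,
# so each output row mixes information from the whole image.
context, self_weights = attention(tokens, tokens, tokens)

# Decoder-style step: a few proposal vectors (hypothetical candidate
# regions) query the image features through cross-attention.
proposals = rng.standard_normal((4, 16))
refined, cross_weights = attention(proposals, context, context)

# One coarse score map per proposal: similarity of the refined
# proposal to the feature at each spatial position.
mask_logits = (refined @ context.T).reshape(4, 8, 8)

print(tokens.shape, context.shape, refined.shape, mask_logits.shape)
```

In this sketch the self-attention weights form a 64 x 64 matrix (every position against every position), which is what gives the encoder its global context, while the cross-attention weights form a 4 x 64 matrix (each proposal against every position).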
Pages: 10
Related papers
50 in total
  • [21] Implicit U-Net for Volumetric Medical Image Segmentation
    Marimont, Sergio Naval
    Tarroni, Giacomo
    MEDICAL IMAGE UNDERSTANDING AND ANALYSIS, MIUA 2022, 2022, 13413 : 387 - 397
  • [22] Boundary Aware U-Net for Medical Image Segmentation
    Alahmadi, Mohammad D.
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2023, 48 (08) : 9929 - 9940
  • [23] Medical Image Segmentation Review: The Success of U-Net
    Azad, Reza
    Aghdam, Ehsan Khodapanah
    Rauland, Amelie
    Jia, Yiwei
    Avval, Atlas Haddadi
    Bozorgpour, Afshin
    Karimijafarbigloo, Sanaz
    Cohen, Joseph Paul
    Adeli, Ehsan
    Merhof, Dorit
    IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024, 46 (12) : 10076 - 10095
  • [24] Diffusion Transformer U-Net for Medical Image Segmentation
    Chowdary, G. Jignesh
    Yin, Zhaozheng
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT IV, 2023, 14223 : 622 - 631
  • [25] Recurrent residual U-Net for medical image segmentation
    Alom, Md Zahangir
    Yakopcic, Chris
    Hasan, Mahmudul
    Taha, Tarek M.
    Asari, Vijayan K.
    JOURNAL OF MEDICAL IMAGING, 2019, 6 (01)
  • [26] Local Adaptive U-net for Medical Image Segmentation
    Liu, Ning
    Liu, Liangliang
    Wang, Jianxin
    2020 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, 2020, : 670 - 674
  • [27] Dual Stream Fusion U-Net Transformers for 3D Medical Image Segmentation
    Hong, Seungkyun
    Ahn, Sunghyun
    Jo, Youngwan
    Park, Sanghyun
    2024 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING, IEEE BIGCOMP 2024, 2024, : 301 - 308
  • [28] Boundary Aware U-Net for Medical Image Segmentation
    Mohammad D. Alahmadi
    Arabian Journal for Science and Engineering, 2023, 48 : 9929 - 9940
  • [29] Medical ultrasound image segmentation using Multi-Residual U-Net architecture
    Shereena V. B.
    Raju G.
    Multimedia Tools and Applications, 2024, 83 (9) : 27067 - 27088
  • [30] Medical ultrasound image segmentation using Multi-Residual U-Net architecture
    Shereena, V. B.
    Raju, G.
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (09) : 27067 - 27088