UTNet: A Hybrid Transformer Architecture for Medical Image Segmentation

Cited by: 308
Authors
Gao, Yunhe [1]
Zhou, Mu [1, 2]
Metaxas, Dimitris N. [1]
Affiliations
[1] Rutgers State Univ, Dept Comp Sci, Piscataway, NJ 08854 USA
[2] SenseBrain & Shanghai AI Lab & Ctr Perceptual & I, Shanghai, Peoples R China
DOI
10.1007/978-3-030-87199-4_6
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Transformer architecture has proven successful in a number of natural language processing tasks. However, its applications to medical vision remain largely unexplored. In this study, we present UTNet, a simple yet powerful hybrid Transformer architecture that integrates self-attention into a convolutional neural network to enhance medical image segmentation. UTNet applies self-attention modules in both the encoder and the decoder to capture long-range dependencies at different scales with minimal overhead. To this end, we propose an efficient self-attention mechanism, along with relative position encoding, that reduces the complexity of the self-attention operation significantly, from O(n^2) to approximately O(n). A new self-attention decoder is also proposed to recover fine-grained details from the skip connections in the encoder. Our approach addresses the dilemma that Transformers require huge amounts of data to learn vision inductive bias. Our hybrid layer design allows Transformers to be initialized within convolutional networks without the need for pre-training. We have evaluated UTNet on a multi-label, multi-vendor cardiac magnetic resonance imaging cohort. UTNet demonstrates superior segmentation performance and robustness compared with state-of-the-art approaches, holding promise to generalize well to other medical image segmentation tasks.
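The abstract states the complexity reduction but not the mechanism behind it. As a rough illustration only, the sketch below assumes one common way to get near-linear cost: the n keys and values are shrunk to a fixed number of tokens k (here by average pooling) before attention, so the attention map has n x k entries instead of n x n. The function names, the pooling choice, and the omission of relative position encoding are all assumptions made for this example, not details taken from the abstract.

```python
import math
import random


def matmul(a, b):
    """Multiply an (r x m) matrix by an (m x c) matrix, both nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax_rows(m):
    """Numerically stable softmax applied to each row."""
    out = []
    for row in m:
        mx = max(row)
        exps = [math.exp(v - mx) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def pool_tokens(x, k):
    """Average-pool n tokens down to k tokens (assumes n is divisible by k)."""
    group = len(x) // k
    dim = len(x[0])
    return [
        [sum(x[i * group + j][c] for j in range(group)) / group for c in range(dim)]
        for i in range(k)
    ]


def reduced_attention(q, keys, values, k):
    """Hypothetical reduced attention: n queries attend to k pooled tokens."""
    dim = len(q[0])
    k_small = pool_tokens(keys, k)      # k x dim
    v_small = pool_tokens(values, k)    # k x dim
    k_small_t = [list(col) for col in zip(*k_small)]  # dim x k
    scores = [[s / math.sqrt(dim) for s in row] for row in matmul(q, k_small_t)]
    attn = softmax_rows(scores)         # n x k instead of n x n
    return matmul(attn, v_small), attn


random.seed(0)
n, dim, k = 64, 8, 4
tokens = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
out, attn = reduced_attention(tokens, tokens, tokens, k)
print(len(out), len(out[0]), len(attn), len(attn[0]))
```

With k held fixed, the number of attention scores grows as n * k rather than n * n, which is the sense in which the cost becomes approximately linear in n under this assumption.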
Pages: 61-71
Number of pages: 11
Related papers
50 in total
  • [41] An Improved Hybrid Model for Medical Image Segmentation
    Yang Feng
    Sun Xiaohuan
    Chen Guoyue
    Wen Tiexiang
    2008 11TH IEEE SINGAPORE INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS (ICCS), VOLS 1-3, 2008, : 367 - +
  • [42] HCA-former: Hybrid Convolution Attention Transformer for 3D Medical Image Segmentation
    Yang, Fan
    Wang, Fan
    Dong, Pengwei
    Wang, Bo
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 90
  • [43] Multiscale TransUNet++: dense hybrid U-Net with transformer for medical image segmentation
    Wang, Bo
    Wang, Fan
    Dong, Pengwei
    Li, Chongyi
    SIGNAL IMAGE AND VIDEO PROCESSING, 2022, 16 (06) : 1607 - 1614
  • [44] TSE DeepLab: An efficient visual transformer for medical image segmentation
    Yang, Jingdong
    Tu, Jun
    Zhang, Xiaolin
    Yu, Shaoqing
    Zheng, Xianyou
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2023, 80
  • [45] TransCUNet: UNet cross fused transformer for medical image segmentation
    Jiang, Shen
    Li, Jinjiang
    COMPUTERS IN BIOLOGY AND MEDICINE, 2022, 150
  • [46] SMESwin Unet: Merging CNN and Transformer for Medical Image Segmentation
    Wang, Ziheng
    Min, Xiongkuo
    Shi, Fangyu
    Jin, Ruinian
    Nawrin, Saida S.
    Yu, Ichen
    Nagatomi, Ryoichi
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT V, 2022, 13435 : 517 - 526
  • [47] From CNN to Transformer: A Review of Medical Image Segmentation Models
    Yao, Wenjian
    Bai, Jiajun
    Liao, Wei
    Chen, Yuheng
    Liu, Mengjuan
    Xie, Yao
    JOURNAL OF IMAGING INFORMATICS IN MEDICINE, 2024, 37 (04) : 1529 - 1547
  • [48] MIXED TRANSFORMER U-NET FOR MEDICAL IMAGE SEGMENTATION
    Wang, Hongyi
    Xie, Shiao
    Lin, Lanfen
    Iwamoto, Yutaro
    Han, Xian-Hua
    Chen, Yen-Wei
    Tong, Ruofeng
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 2390 - 2394
  • [49] An effective CNN and Transformer complementary network for medical image segmentation
    Yuan, Feiniu
    Zhang, Zhengxiao
    Fang, Zhijun
    PATTERN RECOGNITION, 2023, 136
  • [50] MR-Trans: MultiResolution Transformer for medical image segmentation
    Zou, Yibo
    Ge, Yan
    Zhao, Linlin
    Li, Wei
    COMPUTERS IN BIOLOGY AND MEDICINE, 2023, 165