UNETR++: Delving Into Efficient and Accurate 3D Medical Image Segmentation

Cited by: 4
Authors
Shaker, Abdelrahman [1]
Maaz, Muhammad [1]
Rasheed, Hanoona [1]
Khan, Salman [1]
Yang, Ming-Hsuan [2, 3, 4]
Khan, Fahad Shahbaz [5, 6]
Affiliations
[1] Mohamed Bin Zayed Univ Artificial Intelligence, Comp Vis Dept, Abu Dhabi, U Arab Emirates
[2] Univ Calif Merced, Elect Engn & Comp Sci Dept, Merced, CA 95343 USA
[3] Yonsei Univ, Coll Comp, Seoul 03722, South Korea
[4] Google, Mountain View, CA 95344 USA
[5] Mohamed Bin Zayed Univ, Abu Dhabi, U Arab Emirates
[6] Linkoping Univ, Elect Engn Dept, S-58183 Linkoping, Sweden
Keywords
Image segmentation; Three-dimensional displays; Transformers; Biomedical imaging; Complexity theory; Graphics processing units; Task analysis; Deep learning; efficient attention; hybrid architecture; medical image segmentation; TRANSFORMER;
DOI
10.1109/TMI.2024.3398728
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Owing to the success of transformer models, recent works study their applicability in 3D medical segmentation tasks. Within the transformer models, the self-attention mechanism is one of the main building blocks that strives to capture long-range dependencies, compared to the local convolutional-based design. However, the self-attention operation has quadratic complexity which proves to be a computational bottleneck, especially in volumetric medical imaging, where the inputs are 3D with numerous slices. In this paper, we propose a 3D medical image segmentation approach, named UNETR++, that offers both high-quality segmentation masks as well as efficiency in terms of parameters, compute cost, and inference speed. The core of our design is the introduction of a novel efficient paired attention (EPA) block that efficiently learns spatial and channel-wise discriminative features using a pair of inter-dependent branches based on spatial and channel attention. Our spatial attention formulation is efficient and has linear complexity with respect to the input. To enable communication between spatial and channel-focused branches, we share the weights of query and key mapping functions that provide a complementary benefit (paired attention), while also reducing the complexity. Our extensive evaluations on five benchmarks, Synapse, BTCV, ACDC, BraTS, and Decathlon-Lung, reveal the effectiveness of our contributions in terms of both efficiency and accuracy. On Synapse, our UNETR++ sets a new state-of-the-art with a Dice Score of 87.2%, while significantly reducing parameters and FLOPs by over 71%, compared to the best method in the literature. Our code and models are available at: https://tinyurl.com/2p87x5xn.
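The paired-attention idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract only states that the query and key mappings are shared between a spatial branch and a channel branch, and that the spatial branch is linear in the input size. Everything else here is an assumption made for illustration, including the function name `paired_attention`, the token-reduction matrices `proj_k` and `proj_v` (which shrink n tokens to p tokens so the spatial attention map is n x p rather than n x n), the scaling factors, and the fusion of the two branches by simple addition.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def paired_attention(x, w_q, w_k, w_v_spatial, w_v_channel, proj_k, proj_v):
    """Illustrative paired attention over n tokens with c channels.

    x:        (n, c) input tokens (a flattened 3D feature volume)
    w_q, w_k: (c, c) query/key weights SHARED by both branches
    w_v_*:    (c, c) separate value weights, one per branch
    proj_k/v: (p, n) hypothetical token-reduction matrices, p << n
    Returns an (n, c) array.
    """
    n, c = x.shape
    q = x @ w_q              # shared queries, (n, c)
    k = x @ w_k              # shared keys,    (n, c)
    v_s = x @ w_v_spatial    # spatial-branch values, (n, c)
    v_c = x @ w_v_channel    # channel-branch values, (n, c)

    # Spatial branch: keys/values are reduced from n to p tokens, so the
    # attention map is (n, p) and the cost grows linearly with n.
    k_p = proj_k @ k                                    # (p, c)
    v_p = proj_v @ v_s                                  # (p, c)
    attn_s = softmax(q @ k_p.T / np.sqrt(c), axis=-1)   # (n, p)
    out_s = attn_s @ v_p                                # (n, c)

    # Channel branch: attention between channels, a (c, c) map.
    attn_c = softmax(q.T @ k / np.sqrt(c), axis=-1)     # (c, c)
    out_c = v_c @ attn_c                                # (n, c)

    # Assumed fusion: plain sum of the two branch outputs.
    return out_s + out_c
```

With, say, n = 64 tokens, c = 8 channels, and p = 4 reduced tokens, the spatial map holds 64 x 4 entries instead of the 64 x 64 that full self-attention would need, and the channel map holds 8 x 8 entries regardless of n.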
Pages: 3377-3390
Number of pages: 14
Related papers
50 in total
  • [1] UNETR: Transformers for 3D Medical Image Segmentation
    Hatamizadeh, Ali
    Tang, Yucheng
    Nath, Vishwesh
    Yang, Dong
    Myronenko, Andriy
    Landman, Bennett
    Roth, Holger R.
    Xu, Daguang
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 1748 - 1758
  • [2] FOCUSNET++: ATTENTIVE AGGREGATED TRANSFORMATIONS FOR EFFICIENT AND ACCURATE MEDICAL IMAGE SEGMENTATION
    Kaul, Chaitanya
    Pears, Nick
    Dai, Hang
    Murray-Smith, Roderick
    Manandhar, Suresh
    2021 IEEE 18TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (ISBI), 2021, : 1042 - 1046
  • [3] Slim UNETR: Scale Hybrid Transformers to Efficient 3D Medical Image Segmentation Under Limited Computational Resources
    Pang, Yan
    Liang, Jiaming
    Huang, Teng
    Chen, Hao
    Li, Yunhao
    Li, Dan
    Huang, Lin
    Wang, Qiong
    IEEE TRANSACTIONS ON MEDICAL IMAGING, 2024, 43 (03) : 994 - 1005
  • [4] Machine Vision Guided 3D Medical Image Compression for Efficient Transmission and Accurate Segmentation in the Clouds
    Liu, Zihao
    Xu, Xiaowei
    Liu, Tao
    Liu, Qi
    Wang, Yanzhi
    Shi, Yiyu
    Wen, Wujie
    Huang, Meiping
    Yuan, Haiyun
    Zhuang, Jian
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 12679 - 12688
  • [5] ResUNet++: An Advanced Architecture for Medical Image Segmentation
    Jha, Debesh
    Smedsrud, Pia H.
    Riegler, Michael A.
    Johansen, Dag
    de Lange, Thomas
    Halvorsen, Pal
    Johansen, Havard D.
    2019 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA (ISM 2019), 2019, : 225 - 230
  • [6] Efficient Folded Attention for 3D Medical Image Reconstruction and Segmentation
    Zhang, Hang
    Zhang, Jinwei
    Wang, Rongguang
    Zhang, Qihao
    Spincemaille, Pascal
    Nguyen, Thanh D.
    Wang, Yi
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 10868 - 10876
  • [7] DeepPrior++: Improving Fast and Accurate 3D Hand Pose Estimation
    Oberweger, Markus
    Lepetit, Vincent
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2017), 2017, : 585 - 594
  • [8] Target area distillation and section attention segmentation network for accurate 3D medical image segmentation
    Xie, Ruiwei
    Pan, Dan
    Zeng, An
    Xu, Xiaowei
    Wang, Tianchen
    Ullah, Najeeb
    Ji, Yuzhu
    HEALTH INFORMATION SCIENCE AND SYSTEMS, 2023, 11 (01)
  • [9] Target area distillation and section attention segmentation network for accurate 3D medical image segmentation
    Ruiwei Xie
    Dan Pan
    An Zeng
    Xiaowei Xu
    Tianchen Wang
    Najeeb Ullah
    Yuzhu Ji
    Health Information Science and Systems, 11
  • [10] A DENSE POINTNET++ ARCHITECTURE FOR 3D POINT CLOUD SEMANTIC SEGMENTATION
    Lian, Yanchao
    Feng, Tuo
    Zhou, Jinliu
    2019 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2019), 2019, : 5061 - 5064