CoST-UNet: Convolution and swin transformer based deep learning architecture for cardiac segmentation

Authors
Islam, Md Rabiul [1]
Qaraqe, Marwa [2]
Serpedin, Erchin [1]
Affiliations
[1] Texas A&M Univ, Elect & Comp Engn, College Stn, TX 77843 USA
[2] Hamad Bin Khalifa Univ, Coll Sci & Engn, Informat & Comp Technol, Doha, Qatar
Keywords
Segmentation; Echocardiogram; Vision transformer; CNN-transformer; Local-global; INDEX;
DOI
10.1016/j.bspc.2024.106633
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering];
Discipline classification code
0831;
Abstract
Automatic segmentation of two-dimensional (2D) echocardiograms is beneficial for the diagnosis and assessment of heart disease. Convolutional Neural Network (CNN) based U-shaped architectures such as UNet have shown remarkable success in medical image segmentation. However, UNet is generally limited in capturing long-range dependencies because of the intrinsic locality of the convolution operation. In contrast, transformer models can capture global-level information using the multi-head attention mechanism. Taken separately, these models exhibit limited localization abilities due to insufficient low-level details. To overcome these limitations, this paper proposes CoST-UNet (Convolution and Swin Transformer-based U-shaped Network), a novel vision transformer architecture that uses a CNN to leverage spatial information from images in the upper layers and a transformer to emphasize global contextual insight in the deeper levels. Unlike existing hybrid models such as TransUNet and UNETR, the transformer block of the proposed model employs a Swin Transformer backbone, which ensures linear computational complexity relative to image size. Furthermore, the primary barrier to improving the performance of transformers, namely the lack of medical images, is effectively addressed by incorporating two convolution layers at the network's uppermost level. The experimental results demonstrate that the model achieves state-of-the-art performance on the ultrasound-based CAMUS dataset (mean Dice Similarity Coefficients of 0.925, 0.851, and 0.895 for segmenting LV_endo, LV_epi, and LA, respectively, from apical 4CH echocardiograms), as well as competitive results on the MRI-based ACDC dataset, owing to its effective capture of local and global context.
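The abstract reports its results as mean Dice Similarity Coefficients. As a point of reference only, the following is a minimal sketch of the standard Dice definition, 2|A ∩ B| / (|A| + |B|), applied to label masks. It is not taken from the paper: the function names, the treatment of two empty masks as a perfect match, and the integer class ids are all assumptions made for illustration.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice Similarity Coefficient between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


def dice_per_class(pred_labels, true_labels, class_ids):
    """Dice score for each class id in a pair of integer label maps.

    class_ids are hypothetical; how ids map to cardiac structures
    depends on the dataset and is not specified here.
    """
    pred_labels = np.asarray(pred_labels)
    true_labels = np.asarray(true_labels)
    return {
        c: dice_coefficient(pred_labels == c, true_labels == c)
        for c in class_ids
    }


if __name__ == "__main__":
    pred = np.array([[0, 1, 1], [2, 2, 0]])
    true = np.array([[0, 1, 0], [2, 2, 2]])
    for class_id, score in dice_per_class(pred, true, [1, 2]).items():
        print(f"class {class_id}: Dice = {score:.3f}")
```

A mean Dice score for a structure would then be the average of this per-image value over the test images, again assuming the conventional definition.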
Pages: 17