An effective CNN and Transformer complementary network for medical image segmentation

Cited by: 150
Authors
Yuan, Feiniu [1, 3, 4]
Zhang, Zhengxiao [1, 3, 4]
Fang, Zhijun [2]
Affiliations
[1] Shanghai Normal Univ SHNU, Coll Informat Mech & Elect Engn, Shanghai 201418, Peoples R China
[2] Donghua Univ, Sch Comp Sci & Technol, Shanghai 201620, Peoples R China
[3] Shanghai Normal Univ, Res Base Online Educ Shanghai Middle & Primary Sch, Shanghai 201418, Peoples R China
[4] Shanghai Normal Univ, Shanghai Engn Res Ctr Intelligent Educ & Bigdata, Shanghai 200234, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Transformer; Medical image segmentation; Feature complementary module; Cross-domain fusion; Convolutional Neural Network; ATTENTION
DOI
10.1016/j.patcog.2022.109228
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Transformer network was originally proposed for natural language processing. Due to its powerful representation ability for long-range dependency, it has been extended for vision tasks in recent years. To fully utilize the advantages of Transformers and Convolutional Neural Networks (CNNs), we propose a CNN and Transformer Complementary Network (CTC-Net) for medical image segmentation. We first design two encoders by Swin Transformers and Residual CNNs to produce complementary features in Transformer and CNN domains, respectively. Then we cross-wisely concatenate these complementary features to propose a Cross-domain Fusion Block (CFB) for effectively blending them. In addition, we compute the correlation between features from the CNN and Transformer domains, and apply channel attention to the self-attention features by Transformers for capturing dual attention information. We incorporate cross-domain fusion, feature correlation and dual attention together to propose a Feature Complementary Module (FCM) for improving the representation ability of features. Finally, we design a Swin Transformer decoder to further improve the representation ability of long-range dependencies, and propose to use skip connections between the Transformer decoded features and the complementary features for extracting spatial details, contextual semantics and long-range information. Skip connections are performed in different levels for enhancing multi-scale invariance. Experimental results show that our CTC-Net significantly surpasses the state-of-the-art image segmentation models based on CNNs, Transformers, and even Transformer and CNN combined models designed for medical image segmentation. It achieves superior performance on different medical applications, including multi-organ segmentation and cardiac segmentation. © 2022 Elsevier Ltd. All rights reserved.
Pages: 12
Related papers
50 items in total
  • [41] UCTNet: Uncertainty-guided CNN-Transformer hybrid networks for medical image segmentation
    Guo, Xiayu
    Lin, Xian
    Yang, Xin
    Yu, Li
    Cheng, Kwang-Ting
    Yan, Zengqiang
    PATTERN RECOGNITION, 2024, 152
  • [42] EFFICIENT BINARY CNN FOR MEDICAL IMAGE SEGMENTATION
    Brahma, Kaustav
    Kumar, Viksit
    Samir, Anthony E.
    Chandrakasan, Anantha P.
    Eldar, Yonina C.
    2021 IEEE 18TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (ISBI), 2021, : 817 - 821
  • [43] CMTFNet: CNN and Multiscale Transformer Fusion Network for Remote-Sensing Image Semantic Segmentation
    Wu, Honglin
    Huang, Peng
    Zhang, Min
    Tang, Wenlong
    Yu, Xinyu
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [44] Semhybridnet: a semantically enhanced hybrid CNN-transformer network for radar pulse image segmentation
    Liu, Hongjia
    Xiao, Yubin
    Wu, Xuan
    Li, Yuanshu
    Zhao, Peng
    Liang, Yanchun
    Wang, Liupu
    Zhou, You
    COMPLEX & INTELLIGENT SYSTEMS, 2024, 10 (02) : 2851 - 2868
  • [45] STCNet: Alternating CNN and improved transformer network for COVID-19 CT image segmentation
    Geng, Peng
    Tan, Ziye
    Wang, Yimeng
    Jia, Wenran
    Zhang, Ying
    Yan, Hongjiang
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 93
  • [46] CTFNet: CNN-Transformer Fusion Network for Remote-Sensing Image Semantic Segmentation
    Wu H.
    Huang P.
    Zhang M.
    Tang W.
    IEEE Geoscience and Remote Sensing Letters, 2024, 21 : 1 - 5
  • [47] Semhybridnet: a semantically enhanced hybrid CNN-transformer network for radar pulse image segmentation
    Hongjia Liu
    Yubin Xiao
    Xuan Wu
    Yuanshu Li
    Peng Zhao
    Yanchun Liang
    Liupu Wang
    You Zhou
    Complex & Intelligent Systems, 2024, 10 : 2851 - 2868
  • [48] TD-Net: unsupervised medical image registration network based on Transformer and CNN
    Song, Lei
    Liu, Guixia
    Ma, Mingrui
    APPLIED INTELLIGENCE, 2022, 52 (15) : 18201 - 18209
  • [49] FFSwinNet: CNN-Transformer Combined Network With FFT for Shale Core SEM Image Segmentation
    Feng, Yilong
    Jia, Lijuan
    Zhang, Jinchuan
    Chen, Junqi
    IEEE ACCESS, 2024, 12 : 73021 - 73032
  • [50] A CNN-transformer-based unsupervised aware hierarchical network for medical image registration
    Fang, Bo
    Wang, Lisheng
    Electronics Letters, 2024, 60 (24)