An effective CNN and Transformer complementary network for medical image segmentation

Cited by: 150
Authors
Yuan, Feiniu [1, 3, 4]
Zhang, Zhengxiao [1, 3, 4]
Fang, Zhijun [2]
Affiliations
[1] Shanghai Normal Univ SHNU, Coll Informat Mech & Elect Engn, Shanghai 201418, Peoples R China
[2] Donghua Univ, Sch Comp Sci & Technol, Shanghai 201620, Peoples R China
[3] Shanghai Normal Univ, Res Base Online Educ Shanghai Middle & Primary Sch, Shanghai 201418, Peoples R China
[4] Shanghai Normal Univ, Shanghai Engn Res Ctr Intelligent Educ & Bigdata, Shanghai 200234, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Transformer; Medical image segmentation; Feature complementary module; Cross-domain fusion; Convolutional Neural Network; Attention
DOI
10.1016/j.patcog.2022.109228
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Transformer network was originally proposed for natural language processing. Due to its powerful representation ability for long-range dependency, it has been extended for vision tasks in recent years. To fully utilize the advantages of Transformers and Convolutional Neural Networks (CNNs), we propose a CNN and Transformer Complementary Network (CTC-Net) for medical image segmentation. We first design two encoders by Swin Transformers and Residual CNNs to produce complementary features in Transformer and CNN domains, respectively. Then we cross-wisely concatenate these complementary features to propose a Cross-domain Fusion Block (CFB) for effectively blending them. In addition, we compute the correlation between features from the CNN and Transformer domains, and apply channel attention to the self-attention features by Transformers for capturing dual attention information. We incorporate cross-domain fusion, feature correlation and dual attention together to propose a Feature Complementary Module (FCM) for improving the representation ability of features. Finally, we design a Swin Transformer decoder to further improve the representation ability of long-range dependencies, and propose to use skip connections between the Transformer decoded features and the complementary features for extracting spatial details, contextual semantics and long-range information. Skip connections are performed in different levels for enhancing multi-scale invariance. Experimental results show that our CTC-Net significantly surpasses the state-of-the-art image segmentation models based on CNNs, Transformers, and even Transformer and CNN combined models designed for medical image segmentation. It achieves superior performance on different medical applications, including multi-organ segmentation and cardiac segmentation. © 2022 Elsevier Ltd. All rights reserved.
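The abstract names three ingredients of the feature complementary step: fusing features from the two domains, measuring their correlation, and applying channel attention to the Transformer features. The toy sketch below illustrates those general ideas only. It is not the authors' CFB or FCM: it uses plain channel concatenation rather than the paper's cross-wise scheme, a parameter-free sigmoid gate in place of learned attention layers, and cosine similarity as a stand-in for the correlation. All function names (`channel_attention`, `channel_correlation`, `fuse`) and the data layout (a list of channels, each a flattened list of spatial values) are assumptions made for illustration.

```python
import math

def channel_attention(feat):
    """Scale each channel by sigmoid(global average); no learned weights (assumption)."""
    gated = []
    for channel in feat:
        mean = sum(channel) / len(channel)       # global average pooling over positions
        weight = 1.0 / (1.0 + math.exp(-mean))   # sigmoid gate in (0, 1)
        gated.append([weight * v for v in channel])
    return gated

def channel_correlation(feat_a, feat_b):
    """Cosine similarity between matching channels of two feature maps."""
    sims = []
    for a, b in zip(feat_a, feat_b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        sims.append(dot / norm if norm > 0 else 0.0)  # all-zero channel maps to 0.0
    return sims

def fuse(cnn_feat, trans_feat):
    """Plain concatenation along the channel axis: CNN channels, then gated Transformer channels."""
    return cnn_feat + channel_attention(trans_feat)

# Toy input: 2 channels per domain, 4 spatial positions each (a flattened 2x2 map).
cnn_feat = [[1.0, 0.0, 1.0, 0.0], [0.5, 0.5, 0.5, 0.5]]
trans_feat = [[2.0, 0.0, 2.0, 0.0], [0.0, 0.0, 0.0, 0.0]]

fused = fuse(cnn_feat, trans_feat)             # 4 channels in total
corr = channel_correlation(cnn_feat, trans_feat)
print(len(fused), corr)
```

In this toy input the first channel pair differs only by a scale factor, so its cosine similarity is close to 1, while the all-zero Transformer channel yields 0. A real model would operate on learned tensors at several resolutions.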
Pages: 12