Comprehensive attention transformer for multi-label segmentation of medical images based on multi-scale feature fusion

Cited by: 0
Authors
Cheng, Hangyuan [1 ]
Guo, Xiaoxin [2 ]
Yang, Guangqi [1 ]
Chen, Cong [3 ]
Dong, Hongliang [1 ]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Peoples R China
[2] Jilin Univ, Coll Comp Sci & Technol, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun 130012, Peoples R China
[3] Jilin Univ, Coll Software, Changchun 130012, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Comprehensive attention; Mixed-attention transformer; Multi-scale aggregation; Transformer; UNet++; Networks;
DOI
10.1016/j.compeleceng.2025.110100
Chinese Library Classification (CLC)
TP3 [Computing technology, computer technology];
Subject classification code
0812;
Abstract
Transformer-based models often neglect the capability of convolutional networks to extract local features. Most U-shaped models utilize only the multi-scale features from the encoder's output and focus solely on the final layer of the decoder's output. Moreover, typical skip connections can be configured only between the encoder and the decoder at the same layer, leaving no room for optimization. An innovative comprehensive attention Transformer (CAFormer) is proposed to address the issue of long-range relations and local features in multi-label segmentation of medical images, which adopts a U-shaped hierarchical encoder-decoder structure. A mixed-attention Transformer (MATrans) module is devised to extract multi-scale features from the encoders and establish multiple encoder-decoder connections using channel-wise cross-attention and self-attention, which can automatically configure the optimal skip connections. During upsampling, a channel-based feature fusion module is proposed to focus on the important channel-based features. A comprehensive attention module (CAM) is designed to extract global context and local features by integrating an enhanced Transformer module with a channel and spatial attention module. Additionally, the encoder's multi-scale features are passed through a hierarchical prediction link via the proposed multi-scale aggregation module (MSAM) to produce the final prediction, rather than directly using the output of the last layer of the decoder as the segmentation outcome. The experiments show that CAFormer is efficient and robust, achieves a DSC of 82.26 and an HD of 15.26 on the Synapse dataset, and outperforms other state-of-the-art models. The code and pre-trained models are available at https://github.com/zed-kingc/CAFormer.
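For illustration only, the following is a minimal PyTorch-style sketch of the channel-and-spatial attention component that the abstract attributes to the comprehensive attention module (CAM). It assumes a CBAM-like design; the class name, reduction ratio, and 7x7 kernel are illustrative assumptions, not the authors' released implementation (see the repository linked above for the actual code).

# Minimal sketch (assumed CBAM-style design), not the authors' CAM implementation.
import torch
import torch.nn as nn


class ChannelSpatialAttention(nn.Module):
    """Sequential channel attention followed by spatial attention over a feature map."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        # Channel attention: squeeze spatial dimensions, produce per-channel weights.
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Spatial attention: 7x7 convolution over pooled channel descriptors.
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        # Channel attention from average- and max-pooled descriptors.
        avg = x.mean(dim=(2, 3))                    # (B, C)
        mx = x.amax(dim=(2, 3))                     # (B, C)
        ch = torch.sigmoid(self.channel_mlp(avg) + self.channel_mlp(mx))
        x = x * ch.view(b, c, 1, 1)
        # Spatial attention from per-pixel channel statistics.
        avg_map = x.mean(dim=1, keepdim=True)       # (B, 1, H, W)
        max_map = x.amax(dim=1, keepdim=True)       # (B, 1, H, W)
        sp = torch.sigmoid(self.spatial_conv(torch.cat([avg_map, max_map], dim=1)))
        return x * sp


if __name__ == "__main__":
    feat = torch.randn(2, 64, 56, 56)               # dummy encoder feature map
    print(ChannelSpatialAttention(64)(feat).shape)  # torch.Size([2, 64, 56, 56])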
Pages: 17
Related papers
50 records in total
  • [11] Semantic Segmentation on Remote Sensing Images with Multi-Scale Feature Fusion
    Zhang, J.; Jin, Q.; Wang, H.; Da, C.; Xiang, S.; Pan, C.
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2019, 31(9): 1509-1517
  • [12] MSF-TransUNet: A Multi-Scale Feature Fusion Transformer-Based U-Net for Medical Image Segmentation with Uniform Attention
    Jiang, Ying; Gong, Lejun; Huang, Hao; Qi, Mingming
    Traitement du Signal, 2025, 42(1): 531-540
  • [13] Segmentation of crack disaster images based on feature extraction enhancement and multi-scale fusion
    Wang, Letian; Wu, Gengkun; Tossou, Akpedje Ingrid Hermilda C. F.; Liang, Zengwei; Xu, Jie
    Earth Science Informatics, 2025, 18(1)
  • [14] Multi-scale feature pyramid fusion network for medical image segmentation
    Zhang, Bing; Wang, Yang; Ding, Caifu; Deng, Ziqing; Li, Linwei; Qin, Zesheng; Ding, Zhao; Bian, Lifeng; Yang, Chen
    International Journal of Computer Assisted Radiology and Surgery, 2023, 18(2): 353-365
  • [16] A Multi-Scale Cross-Fusion Medical Image Segmentation Network Based on Dual-Attention Mechanism Transformer
    Cui, Jianguo; Wang, Liejun; Jiang, Shaochen
    Applied Sciences-Basel, 2023, 13(19)
  • [17] Multi-Scale Feature Attention-DEtection TRansformer: Multi-Scale Feature Attention for security check object detection
    Sima, Haifeng; Chen, Bailiang; Tang, Chaosheng; Zhang, Yudong; Sun, Junding
    IET Computer Vision, 2024, 18(5): 613-625
  • [18] Feature ensemble network for medical image segmentation with multi-scale atrous transformer
    Gai, Di; Geng, Yuhan; Huang, Xia; Huang, Zheng; Xiong, Xin; Zhou, Ruihua; Wang, Qi
    IET Image Processing, 2024, 18(11): 3082-3092
  • [19] Dual Attention Based Multi-scale Feature Fusion Network for Indoor RGBD Semantic Segmentation
    Hua, Zhongwei; Qi, Lizhe; Du, Daming; Jiang, Wenxuan; Sun, Yunquan
    2022 26th International Conference on Pattern Recognition (ICPR), 2022: 3639-3644
  • [20] Self-Attention-based Multi-Scale Feature Fusion Network for Road Ponding Segmentation
    Yang, Shangyu; Zhang, Ronghui; Sun, Wencai; Chen, Shengru; Ye, Cong; Wu, Hao; Li, Mengran
    2024 2nd Asia Conference on Computer Vision, Image Processing and Pattern Recognition (CVIPPR 2024), 2024