Comprehensive attention transformer for multi-label segmentation of medical images based on multi-scale feature fusion

Times cited: 0
Authors
Cheng, Hangyuan [1 ]
Guo, Xiaoxin [2 ]
Yang, Guangqi [1 ]
Chen, Cong [3 ]
Dong, Hongliang [1 ]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Peoples R China
[2] Jilin Univ, Coll Comp Sci & Technol, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun 130012, Peoples R China
[3] Jilin Univ, Coll Software, Changchun 130012, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Comprehensive attention; Mixed-attention transformer; Multi-scale aggregation; Transformer; UNET PLUS PLUS; NETWORKS;
DOI
10.1016/j.compeleceng.2025.110100
CLC number
TP3 [Computing Technology, Computer Technology];
Discipline code
0812;
Abstract
Transformer-based models often neglect the capability of convolutional networks to extract local features. Most U-shaped models utilize only the multi-scale features from the encoder's output and rely solely on the final layer of the decoder's output. Moreover, typical skip connections can be configured only between the encoder and the decoder at the same layer, leaving no room for optimization. An innovative comprehensive attention Transformer (CAFormer), which adopts a U-shaped hierarchical encoder-decoder structure, is proposed to address the modeling of long-range relations and local features in multi-label segmentation of medical images. A mixed-attention Transformer (MATrans) module is devised to extract multi-scale features from the encoders and to establish multiple encoder-decoder connections using channel-wise cross-attention and self-attention, so that the optimal skip connections can be configured automatically. During upsampling, a channel-based feature fusion module is proposed to focus on the important channel-based features. A comprehensive attention module (CAM) is designed to extract global context and local features by integrating an enhanced Transformer module with a channel and spatial attention module. Additionally, rather than directly using the output of the last decoder layer as the segmentation result, the encoder's multi-scale features are passed through a hierarchical prediction link via the proposed multi-scale aggregation module (MSAM) to produce the final prediction. Experiments show that CAFormer is efficient and robust, achieves a DSC of 82.26 and an HD of 15.26 on the Synapse dataset, and outperforms other state-of-the-art models. The code and pre-trained models are available at https://github.com/zed-kingc/CAFormer.
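To make the channel-wise cross-attention skip connection described in the abstract more concrete, the following is a minimal PyTorch sketch. It assumes illustrative names (ChannelCrossAttention, enc_feat, dec_feat) and a simplified squeeze-and-gate formulation; it is not the released CAFormer implementation, which is available at the GitHub link above.

import torch
import torch.nn as nn

class ChannelCrossAttention(nn.Module):
    """Illustrative channel-wise cross-attention gate for a skip connection.

    The decoder feature produces a per-channel attention vector that
    re-weights the encoder feature before it is passed to the decoder.
    This is a sketch of the general idea only, not the CAFormer code.
    """
    def __init__(self, enc_channels: int, dec_channels: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)              # global average pooling
        self.fc_enc = nn.Linear(enc_channels, enc_channels, bias=False)
        self.fc_dec = nn.Linear(dec_channels, enc_channels, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, enc_feat: torch.Tensor, dec_feat: torch.Tensor) -> torch.Tensor:
        b, c_enc, _, _ = enc_feat.shape
        # Squeeze both streams to per-channel descriptors.
        q = self.fc_dec(self.pool(dec_feat).flatten(1))  # query from decoder
        k = self.fc_enc(self.pool(enc_feat).flatten(1))  # key from encoder
        # Channel attention: the decoder decides which encoder channels matter.
        attn = self.sigmoid(q * k).view(b, c_enc, 1, 1)
        return enc_feat * attn                           # re-weighted skip feature

if __name__ == "__main__":
    enc = torch.randn(2, 64, 56, 56)   # encoder feature map
    dec = torch.randn(2, 128, 56, 56)  # upsampled decoder feature map
    gate = ChannelCrossAttention(enc_channels=64, dec_channels=128)
    print(gate(enc, dec).shape)        # torch.Size([2, 64, 56, 56])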
Pages: 17