Comprehensive attention transformer for multi-label segmentation of medical images based on multi-scale feature fusion

Authors
Cheng, Hangyuan [1]
Guo, Xiaoxin [2]
Yang, Guangqi [1]
Chen, Cong [3]
Dong, Hongliang [1]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Peoples R China
[2] Jilin Univ, Coll Comp Sci & Technol, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun 130012, Peoples R China
[3] Jilin Univ, Coll Software, Changchun 130012, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Comprehensive attention; Mixed-attention transformer; Multi-scale aggregation; Transformer; UNET PLUS PLUS; NETWORKS;
DOI
10.1016/j.compeleceng.2025.110100
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Transformer-based models often neglect the ability of convolutional networks to extract local features. Most U-shaped models use only the multi-scale features from the encoder's output and focus solely on the output of the decoder's final layer. Moreover, typical skip connections can be configured only between the encoder and the decoder at the same layer, with no scope for optimization. A comprehensive attention Transformer (CAFormer) with a U-shaped hierarchical encoder-decoder structure is proposed to capture both long-range relations and local features in multi-label segmentation of medical images. A mixed-attention Transformer (MATrans) module is devised to extract multi-scale features from the encoders and establish multiple encoder-decoder connections using channel-wise cross-attention and self-attention, which can automatically configure the optimal skip connections. During upsampling, a channel-based feature fusion module is proposed to focus on the important channel-based features. A comprehensive attention module (CAM) is designed to extract global context and local features by integrating an enhanced Transformer module with a channel and spatial attention module. Additionally, the encoder's multi-scale features pass through a hierarchical prediction link in the proposed multi-scale aggregation module (MSAM) to produce the final prediction, rather than the output of the decoder's last layer being used directly as the segmentation outcome. Experiments show that CAFormer is efficient and robust, achieves a DSC of 82.26 and an HD of 15.26 on the Synapse dataset, and outperforms other state-of-the-art models. The code and pre-trained models are available at https://github.com/zed-kingc/CAFormer.
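For readers unfamiliar with the DSC figure quoted in the abstract, the following is a minimal sketch of how a per-label Dice similarity coefficient is commonly computed for multi-label segmentation, using the standard definition 2|A ∩ B| / (|A| + |B|). It is not taken from the CAFormer code. The function names `dice_per_label` and `mean_dice`, the flattened-list input format, and the convention of scoring an absent label as 1.0 are all assumptions made for illustration; the paper's exact evaluation protocol is not described in this record.

```python
from typing import Dict, Sequence


def dice_per_label(
    pred: Sequence[int], target: Sequence[int], labels: Sequence[int]
) -> Dict[int, float]:
    """Dice similarity coefficient per label over flattened label maps.

    `pred` and `target` hold one integer class label per pixel (or voxel).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of elements")
    scores: Dict[int, float] = {}
    for label in labels:
        pred_count = sum(1 for p in pred if p == label)
        target_count = sum(1 for t in target if t == label)
        overlap = sum(1 for p, t in zip(pred, target) if p == label and t == label)
        denom = pred_count + target_count
        # Assumed convention: a label absent from both maps scores 1.0.
        scores[label] = 1.0 if denom == 0 else 2.0 * overlap / denom
    return scores


def mean_dice(
    pred: Sequence[int], target: Sequence[int], labels: Sequence[int]
) -> float:
    """Unweighted mean of the per-label Dice scores."""
    scores = dice_per_label(pred, target, labels)
    return sum(scores.values()) / len(scores)


if __name__ == "__main__":
    # Toy 6-pixel example with background 0 and two foreground labels.
    pred = [0, 1, 1, 2, 2, 0]
    target = [0, 1, 2, 2, 2, 0]
    print(dice_per_label(pred, target, labels=[1, 2]))
    print(mean_dice(pred, target, labels=[1, 2]))
```

In the toy example, label 1 overlaps on one pixel out of three counted across both maps (Dice 2/3) and label 2 on two out of five (Dice 0.8). Reported scores such as the one in the abstract are typically averaged over foreground labels and test cases, but whether the paper follows that convention is an assumption here.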
Pages: 17