Comprehensive attention transformer for multi-label segmentation of medical images based on multi-scale feature fusion

Cited by: 0
Authors
Cheng, Hangyuan [1]
Guo, Xiaoxin [2]
Yang, Guangqi [1]
Chen, Cong [3]
Dong, Hongliang [1]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Peoples R China
[2] Jilin Univ, Coll Comp Sci & Technol, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun 130012, Peoples R China
[3] Jilin Univ, Coll Software, Changchun 130012, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Comprehensive attention; Mixed-attention transformer; Multi-scale aggregation; Transformer; UNET PLUS PLUS; NETWORKS;
DOI
10.1016/j.compeleceng.2025.110100
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Transformer-based models often neglect the capability of convolutional networks to extract local features. Most U-shaped models utilize only the multi-scale features from the encoder's output and focus solely on the final layer of the decoder's output. Moreover, typical skip connections can be configured only between the encoder and the decoder in the same layer, without any potential optimization. An innovative comprehensive attention Transformer (CAFormer), which adopts a U-shaped hierarchical encoder-decoder structure, is proposed to address the issue of long-range relations and local features in multi-label segmentation of medical images. A mixed-attention Transformer (MATrans) module is devised to extract multi-scale features from the encoders and establish multiple encoder-decoder connections using channel-wise cross-attention and self-attention, which can automatically configure the optimal skip connections. During upsampling, a channel-based feature fusion module is proposed to focus on the important channel-based features. A comprehensive attention module (CAM) is designed to extract global context and local features by integrating an enhanced Transformer module and a channel and spatial attention module. Additionally, the encoder's multi-scale features undergo the hierarchical prediction link through the proposed multi-scale aggregation module (MSAM) for the final prediction, rather than directly using the output of the last layer of the decoder as the segmentation outcome. Experiments show that CAFormer is efficient and robust, achieves a DSC of 82.26 and an HD of 15.26 on the Synapse dataset, and outperforms other state-of-the-art models. The code and pre-trained models are available at https://github.com/zed-kingc/CAFormer.
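The abstract reports results as DSC (Dice similarity coefficient) and HD (Hausdorff distance). As a rough illustration of what these two metrics measure, the following is a minimal sketch in plain Python. It is not the authors' evaluation code: the function names `mask_points`, `dice`, and `hausdorff` and the toy label maps are hypothetical, the treatment of an empty label is a convention chosen here, and the plain maximum form of HD is an assumption (published evaluations often report a percentile variant instead).

```python
from math import dist


def mask_points(label_map, label):
    """Return the set of (row, col) positions where a 2D label map equals `label`."""
    return {
        (r, c)
        for r, row in enumerate(label_map)
        for c, value in enumerate(row)
        if value == label
    }


def dice(pred, target, label):
    """Dice similarity coefficient for one label: 2|P ∩ T| / (|P| + |T|)."""
    p = mask_points(pred, label)
    t = mask_points(target, label)
    if not p and not t:
        # Assumption: a label absent from both maps counts as perfect agreement.
        return 1.0
    return 2 * len(p & t) / (len(p) + len(t))


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    def directed(x, y):
        # Largest distance from any point in x to its nearest point in y.
        return max(min(dist(u, v) for v in y) for u in x)

    return max(directed(a, b), directed(b, a))


# Hypothetical 3x3 label maps: 0 = background, 1 and 2 = two foreground labels.
pred = [
    [0, 1, 1],
    [0, 1, 2],
    [0, 2, 2],
]
target = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 2, 2],
]

labels = [1, 2]
dice_per_label = {k: dice(pred, target, k) for k in labels}
mean_dice = sum(dice_per_label.values()) / len(labels)
hd_label_1 = hausdorff(mask_points(pred, 1), mask_points(target, 1))
```

For multi-label segmentation, the per-label Dice values are typically averaged over the foreground labels, as `mean_dice` does here. A higher DSC and a lower HD both indicate closer agreement with the reference segmentation.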
Pages: 17
Related papers
50 in total
  • [1] Semantic Segmentation of Remote Sensing Images Based on Dual Attention and Multi-scale Feature Fusion
    Weng, Mengqian
    Hu, Zhibo
    Xie, Xiaopeng
    Li, Yunhong
    Hu, Lei
    TWELFTH INTERNATIONAL CONFERENCE ON GRAPHICS AND IMAGE PROCESSING (ICGIP 2020), 2021, 11720
  • [2] A Segmentation Algorithm of Colonoscopy Images Based on Multi-Scale Feature Fusion
    Yu, Jing
    Li, Zhengping
    Xu, Chao
    Feng, Bo
    ELECTRONICS, 2022, 11 (16)
  • [3] MAXFormer: Enhanced transformer for medical image segmentation with multi-attention and multi-scale features fusion
    Liang, Zhiwei
    Zhao, Kui
    Liang, Gang
    Li, Siyu
    Wu, Yifei
    Zhou, Yiping
    KNOWLEDGE-BASED SYSTEMS, 2023, 280
  • [4] A Road Crack Segmentation Method Based on Transformer and Multi-Scale Feature Fusion
    Xu, Yang
    Xia, Yonghua
    Zhao, Quai
    Yang, Kaihua
    Li, Qiang
    ELECTRONICS, 2024, 13 (12)
  • [5] Collaborative Attention Guided Multi-Scale Feature Fusion Network for Medical Image Segmentation
    Xu, Zhenghua
    Tian, Biao
    Liu, Shijie
    Wang, Xiangtao
    Yuan, Di
    Gu, Junhua
    Chen, Junyang
    Lukasiewicz, Thomas
    Leung, Victor C. M.
    IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, 2024, 11 (02) : 1857 - 1871
  • [6] Multi-scale feature flow alignment fusion with Transformer for the microscopic images segmentation of activated sludge
    Lijie Zhao
    Yingying Zhang
    Guogang Wang
    Mingzhong Huang
    Qichun Zhang
    Hamid Reza Karimi
    Signal, Image and Video Processing, 2024, 18 : 1241 - 1248
  • [7] Multi-scale feature flow alignment fusion with Transformer for the microscopic images segmentation of activated sludge
    Zhao, Lijie
    Zhang, Yingying
    Wang, Guogang
    Huang, Mingzhong
    Zhang, Qichun
    Karimi, Hamid Reza
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (02) : 1241 - 1248
  • [8] Multi-scale feature fusion network with local attention for lung segmentation
    Xie, Yinghua
    Zhou, Yuntong
    Wang, Chen
    Ma, Yanshan
    Yang, Ming
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2023, 119
  • [9] Multi-Scale Cross-Modal Spatial Attention Fusion for Multi-label Image Recognition
    Li, Junbing
    Zhang, Changqing
    Wang, Xueman
    Du, Ling
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2020, PT I, 2020, 12396 : 736 - 747
  • [10] Research on Image Segmentation Method Based on Multi-Scale Feature Fusion and Dual Attention
    Wang, Zhihong
    Wang, Chaoying
    Li, Jianxin
    Wu, Tianxiang
    Li, Jiajun
    Huang, Hongxing
    Jiang, Lai
    Journal of Computers (Taiwan), 2024, 35 (06) : 45 - 54