Comprehensive attention transformer for multi-label segmentation of medical images based on multi-scale feature fusion

Cited by: 0
Authors
Cheng, Hangyuan [1 ]
Guo, Xiaoxin [2 ]
Yang, Guangqi [1 ]
Chen, Cong [3 ]
Dong, Hongliang [1 ]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Peoples R China
[2] Jilin Univ, Coll Comp Sci & Technol, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun 130012, Peoples R China
[3] Jilin Univ, Coll Software, Changchun 130012, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Comprehensive attention; Mixed-attention transformer; Multi-scale aggregation; Transformer; UNet++; Networks;
DOI
10.1016/j.compeleceng.2025.110100
Chinese Library Classification (CLC)
TP3 [Computing technology, computer technology];
Subject classification code
0812;
Abstract
Transformer-based models often neglect the capability of convolutional networks to extract local features. Most U-shaped models utilize only the multi-scale features from the encoder's output and focus solely on the final layer of the decoder's output. Moreover, typical skip connections can be configured only between the encoder and the decoder at the same layer, leaving no room for optimization. An innovative comprehensive attention Transformer (CAFormer) is proposed to address the issue of long-range relations and local features in multi-label segmentation of medical images, which adopts a U-shaped hierarchical encoder-decoder structure. A mixed-attention Transformer (MATrans) module is devised to extract multi-scale features from the encoders and establish multiple encoder-decoder connections using channel-wise cross-attention and self-attention, which can automatically configure the optimal skip connections. During upsampling, a channel-based feature fusion module is proposed to focus on the important channel-based features. A comprehensive attention module (CAM) is designed to extract global context and local features by integrating an enhanced Transformer module with a channel and spatial attention module. Additionally, the encoder's multi-scale features are passed through a hierarchical prediction link via the proposed multi-scale aggregation module (MSAM) to produce the final prediction, rather than directly using the output of the last layer of the decoder as the segmentation outcome. The experiments show that CAFormer is efficient and robust, achieves a DSC of 82.26 and an HD of 15.26 on the Synapse dataset, and outperforms other state-of-the-art models. The code and pre-trained models are available at https://github.com/zed-kingc/CAFormer.
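For illustration only, the following is a minimal PyTorch-style sketch of the channel-and-spatial attention component that the abstract attributes to the comprehensive attention module (CAM). It assumes a CBAM-like design; the class name, reduction ratio, and 7x7 kernel are illustrative assumptions, not the authors' released implementation (see the repository linked above for the actual code).

# Minimal sketch (assumed CBAM-style design), not the authors' CAM implementation.
import torch
import torch.nn as nn


class ChannelSpatialAttention(nn.Module):
    """Sequential channel attention followed by spatial attention over a feature map."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        # Channel attention: squeeze spatial dimensions, produce per-channel weights.
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Spatial attention: 7x7 convolution over pooled channel descriptors.
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        # Channel attention from average- and max-pooled descriptors.
        avg = x.mean(dim=(2, 3))                    # (B, C)
        mx = x.amax(dim=(2, 3))                     # (B, C)
        ch = torch.sigmoid(self.channel_mlp(avg) + self.channel_mlp(mx))
        x = x * ch.view(b, c, 1, 1)
        # Spatial attention from per-pixel channel statistics.
        avg_map = x.mean(dim=1, keepdim=True)       # (B, 1, H, W)
        max_map = x.amax(dim=1, keepdim=True)       # (B, 1, H, W)
        sp = torch.sigmoid(self.spatial_conv(torch.cat([avg_map, max_map], dim=1)))
        return x * sp


if __name__ == "__main__":
    feat = torch.randn(2, 64, 56, 56)               # dummy encoder feature map
    print(ChannelSpatialAttention(64)(feat).shape)  # torch.Size([2, 64, 56, 56])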
Pages: 17
Related papers
50 records in total
  • [11] Semantic Segmentation on Remote Sensing Images with Multi-Scale Feature Fusion
    Zhang, J.; Jin, Q.; Wang, H.; Da, C.; Xiang, S.; Pan, C.
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2019, 31(9): 1509-1517
  • [12] MSF-TransUNet: A Multi-Scale Feature Fusion Transformer-Based U-Net for Medical Image Segmentation with Uniform Attention
    Jiang, Ying; Gong, Lejun; Huang, Hao; Qi, Mingming
    Traitement du Signal, 2025, 42(1): 531-540
  • [13] Segmentation of crack disaster images based on feature extraction enhancement and multi-scale fusion
    Wang, Letian; Wu, Gengkun; Tossou, Akpedje Ingrid Hermilda C. F.; Liang, Zengwei; Xu, Jie
    Earth Science Informatics, 2025, 18(1)
  • [14] Multi-scale feature pyramid fusion network for medical image segmentation
    Zhang, Bing; Wang, Yang; Ding, Caifu; Deng, Ziqing; Li, Linwei; Qin, Zesheng; Ding, Zhao; Bian, Lifeng; Yang, Chen
    International Journal of Computer Assisted Radiology and Surgery, 2023, 18(2): 353-365
  • [16] A Multi-Scale Cross-Fusion Medical Image Segmentation Network Based on Dual-Attention Mechanism Transformer
    Cui, Jianguo; Wang, Liejun; Jiang, Shaochen
    Applied Sciences-Basel, 2023, 13(19)
  • [17] Multi-Scale Feature Attention-DEtection TRansformer: Multi-Scale Feature Attention for security check object detection
    Sima, Haifeng; Chen, Bailiang; Tang, Chaosheng; Zhang, Yudong; Sun, Junding
    IET Computer Vision, 2024, 18(5): 613-625
  • [18] Feature ensemble network for medical image segmentation with multi-scale atrous transformer
    Gai, Di; Geng, Yuhan; Huang, Xia; Huang, Zheng; Xiong, Xin; Zhou, Ruihua; Wang, Qi
    IET Image Processing, 2024, 18(11): 3082-3092
  • [19] Dual Attention Based Multi-scale Feature Fusion Network for Indoor RGBD Semantic Segmentation
    Hua, Zhongwei; Qi, Lizhe; Du, Daming; Jiang, Wenxuan; Sun, Yunquan
    2022 26th International Conference on Pattern Recognition (ICPR), 2022: 3639-3644
  • [20] Self-Attention-based Multi-Scale Feature Fusion Network for Road Ponding Segmentation
    Yang, Shangyu; Zhang, Ronghui; Sun, Wencai; Chen, Shengru; Ye, Cong; Wu, Hao; Li, Mengran
    2024 2nd Asia Conference on Computer Vision, Image Processing and Pattern Recognition (CVIPPR 2024), 2024