Comprehensive attention transformer for multi-label segmentation of medical images based on multi-scale feature fusion

Cited by: 0

Authors:

Cheng, Hangyuan [1]

Guo, Xiaoxin [2]

Yang, Guangqi [1]

Chen, Cong [3]

Dong, Hongliang [1]

Affiliations:

[1] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Peoples R China

[2] Jilin Univ, Coll Comp Sci & Technol, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun 130012, Peoples R China

[3] Jilin Univ, Coll Software, Changchun 130012, Peoples R China

Source:

COMPUTERS & ELECTRICAL ENGINEERING | 2025 / Vol. 123

Funding:

National Natural Science Foundation of China;

Keywords:

Comprehensive attention; Mixed-attention transformer; Multi-scale aggregation; Transformer; UNET PLUS PLUS; NETWORKS;

DOI:

10.1016/j.compeleceng.2025.110100

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Transformer-based models often neglect the ability of convolutional networks to extract local features. Most U-shaped models use only the multi-scale features from the encoder's output and focus solely on the final layer of the decoder's output. Moreover, typical skip connections can be configured only between the encoder and the decoder at the same layer, without any potential optimization. An innovative comprehensive attention Transformer (CAFormer) is proposed to address the issue of long-range relations and local features in multi-label segmentation of medical images; it adopts a U-shaped hierarchical encoder-decoder structure. A mixed-attention Transformer (MATrans) module is devised to extract multi-scale features from the encoders and establish multiple encoder-decoder connections using channel-wise cross-attention and self-attention, which can automatically configure the optimal skip connections. During upsampling, a channel-based feature fusion module is proposed to focus on the important channel-based features. A comprehensive attention module (CAM) is designed to extract global context and local features by integrating an enhanced Transformer module and a channel and spatial attention module. Additionally, the encoder's multi-scale features pass through a hierarchical prediction link in the proposed multi-scale aggregation module (MSAM) to produce the final prediction, rather than the output of the decoder's last layer being used directly as the segmentation outcome. The experiments show that CAFormer is efficient and robust, achieves a DSC of 82.26 and an HD of 15.26 on the Synapse dataset, and outperforms other state-of-the-art models. The code and pre-trained models are available at https://github.com/zed-kingc/CAFormer.
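The abstract reports two evaluation metrics, the Dice similarity coefficient (DSC) and the Hausdorff distance (HD). As a rough illustration of what these numbers measure, the sketch below computes both on a toy multi-label mask in plain Python. It is not taken from the CAFormer repository: the function names, the 3x3 masks, and the convention that label 0 is background are assumptions made for this example, and the HD shown is the plain maximum-based definition, which may differ from the variant used in the paper's evaluation.

```python
from math import dist


def class_pixels(mask, label):
    """Return the set of (row, col) coordinates where mask equals label."""
    return {(r, c) for r, row in enumerate(mask)
            for c, value in enumerate(row) if value == label}


def dice_per_class(pred, target, num_classes):
    """Dice score for each foreground label (label 0 is assumed background)."""
    scores = {}
    for label in range(1, num_classes):
        p = class_pixels(pred, label)
        t = class_pixels(target, label)
        if not p and not t:
            continue  # label absent from both masks, so no score is defined
        scores[label] = 2 * len(p & t) / (len(p) + len(t))
    return scores


def hausdorff(points_a, points_b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    def directed(src, dst):
        # Largest distance from any source point to its nearest target point.
        return max(min(dist(a, b) for b in dst) for a in src)
    return max(directed(points_a, points_b), directed(points_b, points_a))


# Hypothetical 3x3 prediction and ground truth with labels 0, 1, 2.
pred = [[0, 1, 1],
        [0, 1, 2],
        [0, 0, 2]]
target = [[0, 1, 1],
          [0, 2, 2],
          [0, 0, 2]]

dice = dice_per_class(pred, target, num_classes=3)
mean_dice = sum(dice.values()) / len(dice)
hd_label_1 = hausdorff(class_pixels(pred, 1), class_pixels(target, 1))
print(dice, mean_dice, hd_label_1)
```

In this toy case the single mislabeled pixel lowers the Dice score of both foreground labels and yields a Hausdorff distance of one pixel for label 1, which shows why the two metrics are reported together: DSC reflects region overlap, while HD reflects the worst boundary error.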


Pages: 17

50 items in total

[1] Semantic Segmentation of Remote Sensing Images Based on Dual Attention and Multi-scale Feature Fusion
Weng, Mengqian
Hu, Zhibo
Xie, Xiaopeng
Li, Yunhong
Hu, Lei
TWELFTH INTERNATIONAL CONFERENCE ON GRAPHICS AND IMAGE PROCESSING (ICGIP 2020), 2021, 11720
[2] A Segmentation Algorithm of Colonoscopy Images Based on Multi-Scale Feature Fusion
Yu, Jing
Li, Zhengping
Xu, Chao
Feng, Bo
ELECTRONICS, 2022, 11 (16)
[3] MAXFormer: Enhanced transformer for medical image segmentation with multi-attention and multi-scale features fusion
Liang, Zhiwei
Zhao, Kui
Liang, Gang
Li, Siyu
Wu, Yifei
Zhou, Yiping
KNOWLEDGE-BASED SYSTEMS, 2023, 280
[4] A Road Crack Segmentation Method Based on Transformer and Multi-Scale Feature Fusion
Xu, Yang
Xia, Yonghua
Zhao, Quai
Yang, Kaihua
Li, Qiang
ELECTRONICS, 2024, 13 (12)
[5] Collaborative Attention Guided Multi-Scale Feature Fusion Network for Medical Image Segmentation
Xu, Zhenghua
Tian, Biao
Liu, Shijie
Wang, Xiangtao
Yuan, Di
Gu, Junhua
Chen, Junyang
Lukasiewicz, Thomas
Leung, Victor C. M.
IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, 2024, 11 (02) : 1857 - 1871
[6] Multi-scale feature flow alignment fusion with Transformer for the microscopic images segmentation of activated sludge
Zhao, Lijie
Zhang, Yingying
Wang, Guogang
Huang, Mingzhong
Zhang, Qichun
Karimi, Hamid Reza
Signal, Image and Video Processing, 2024, 18 : 1241 - 1248
[7] Multi-scale feature flow alignment fusion with Transformer for the microscopic images segmentation of activated sludge
Zhao, Lijie
Zhang, Yingying
Wang, Guogang
Huang, Mingzhong
Zhang, Qichun
Karimi, Hamid Reza
SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (02) : 1241 - 1248
[8] Multi-scale feature fusion network with local attention for lung segmentation
Xie, Yinghua
Zhou, Yuntong
Wang, Chen
Ma, Yanshan
Yang, Ming
SIGNAL PROCESSING-IMAGE COMMUNICATION, 2023, 119
[9] Multi-Scale Cross-Modal Spatial Attention Fusion for Multi-label Image Recognition
Li, Junbing
Zhang, Changqing
Wang, Xueman
Du, Ling
ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2020, PT I, 2020, 12396 : 736 - 747
[10] Research on Image Segmentation Method Based on Multi-Scale Feature Fusion and Dual Attention
Wang, Zhihong
Wang, Chaoying
Li, Jianxin
Wu, Tianxiang
Li, Jiajun
Huang, Hongxing
Jiang, Lai
Journal of Computers (Taiwan), 2024, 35 (06) : 45 - 54
