MMCA-NET: A Multimodal Cross Attention Transformer Network for Nasopharyngeal Carcinoma Tumor Segmentation Based on a Total-Body PET/CT System

Times Cited: 5
Authors
Zhao, Wenjie [1 ,2 ]
Huang, Zhenxing [1 ,2 ]
Tang, Si [3 ]
Li, Wenbo [1 ,2 ]
Gao, Yunlong [1 ,2 ]
Hu, Yingying [3 ]
Fan, Wei [3 ]
Cheng, Chuanli [1 ,2 ]
Yang, Yongfeng [1 ,4 ]
Zheng, Hairong [1 ,4 ]
Liang, Dong [1 ,4 ]
Hu, Zhanli [1 ,4 ]
Affiliations
[1] Chinese Acad Sci, Shenzhen Inst Adv Technol, Lauterbur Res Ctr Biomed Imaging, Shenzhen 518055, Peoples R China
[2] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Peoples R China
[3] Sun Yat Sen Univ, Dept Nucl Med, Canc Ctr, Guangzhou 510060, Peoples R China
[4] Chinese Acad Sci, Key Lab Biomed Imaging Sci & Syst, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Feature extraction; Image segmentation; Computed tomography; Transformers; Decoding; Cancer; Deep learning; Nasopharyngeal carcinoma segmentation; multimodal PET/CT; transformer; cross attention;
DOI
10.1109/JBHI.2024.3405993
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Nasopharyngeal carcinoma (NPC) is a malignant tumor treated primarily with radiotherapy, and accurate delineation of the target tumor is essential for effective treatment. However, current segmentation models perform unsatisfactorily owing to indistinct tumor boundaries and large variation in tumor volume, while manual delineation for radiotherapy remains labor-intensive. In this paper, we introduce MMCA-Net, a novel segmentation network for NPC in PET/CT images that combines an innovative multimodal cross attention transformer (MCA-Transformer) with a modified U-Net architecture, enhancing modal fusion through cross-attention between CT and PET data. Tested against ten algorithms via fivefold cross-validation on samples from Sun Yat-sen University Cancer Center and the public HECKTOR dataset, our method consistently ranked first on all four evaluation metrics, with average Dice similarity coefficients of 0.815 and 0.7944, respectively. Ablation experiments further demonstrate its advantage over multiple baseline and variant techniques. The proposed method also holds promise for application to other tasks.
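The record does not include the paper's architectural details, but the cross-attention fusion the abstract describes can be illustrated in simplified form: tokens from one modality (e.g., CT) supply the queries while the other modality (PET) supplies the keys and values, so each CT location attends over the PET feature map. The NumPy sketch below is a generic single-head cross-attention, not the authors' MCA-Transformer; all shapes, names, and the omission of learned projections are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q_feats, kv_feats):
    """Scaled dot-product cross attention: queries come from one modality,
    keys/values from the other. Learned Q/K/V projections are omitted here
    for brevity; this is a hypothetical simplification, not the paper's model."""
    d_k = q_feats.shape[-1]
    scores = q_feats @ kv_feats.T / np.sqrt(d_k)  # (Nq, Nkv) similarity
    weights = softmax(scores, axis=-1)            # each CT token attends over PET tokens
    return weights @ kv_feats                     # (Nq, d) fused features

rng = np.random.default_rng(0)
ct = rng.standard_normal((16, 32))   # 16 CT tokens, 32-dim features (assumed sizes)
pet = rng.standard_normal((16, 32))  # 16 PET tokens, 32-dim features
fused = cross_attention(ct, pet)
print(fused.shape)  # (16, 32)
```

In a bidirectional fusion block, the same operation would typically also be run with PET as the query modality, and the two outputs combined before the U-Net decoder.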
Pages: 5447-5458
Page count: 12