MULTI-SCALE CONVOLUTION-TRANSFORMER FUSION NETWORK FOR ENDOSCOPIC IMAGE SEGMENTATION

Cited by: 1
Authors
Zou, Baosheng [1, 2]
Zhou, Zongguang [3, 4, 5]
Han, Ying [6]
Li, Kang [1]
Wang, Guotai [2, 7]
Affiliations
[1] Sichuan Univ, West China Biomed Big Data Ctr, Chengdu, Peoples R China
[2] Univ Elect Sci & Technol China, Sch Mech & Elect Engn, Chengdu, Peoples R China
[3] Sichuan Univ, Div Gastrointestinal Surg, Dept Gen Surg, West China Hosp, Chengdu, Peoples R China
[4] Sichuan Univ, Inst Digest Surg, Chengdu, Peoples R China
[5] Sichuan Univ, State Key Lab Biotherapy, Chengdu, Peoples R China
[6] Sichuan Univ, West China Med Simulat Ctr, West China Hosp, Chengdu, Peoples R China
[7] Shanghai AI Lab, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Medical image segmentation; Transformer; Endoscopic image; Image guided surgery
DOI
10.1109/ISBI53787.2023.10230738
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Automatic semantic segmentation of endoscopic images is an essential part of computer-assisted surgical intervention. Convolutional Neural Networks (CNNs) have recently been widely applied to endoscopic image segmentation, but their performance remains limited by their weak ability to capture global long-range dependencies. To address this problem, this paper proposes a model that combines a CNN with a Transformer, named the Multi-scale Convolution-Transformer Fusion Network (MCTFNet). It consists of three components: 1) Multiple-parallel Multi-scale Transformer Convolution (MMTC) modules, placed in parallel branches to extract multi-scale information; 2) a Multi-scale Information Fusion (MIF) module that fuses information from the parallel branches so that features at different resolutions can interact; and 3) a High-resolution Information Processing (HIP) module that keeps high-resolution features of the image and avoids loss of detail. We verified our method on the HeiSurF dataset, where it achieved an average Dice of 80.07%, outperforming state-of-the-art CNNs including HRNet (79.93%) and DeepLabv3 (78.34%). It also outperformed several networks designed for medical image segmentation.
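The abstract reports results as an average Dice but does not say how the average is taken (per class, per image, or both). As a minimal sketch of the standard definition, Dice = 2|A ∩ B| / (|A| + |B|) for predicted region A and ground-truth region B of one class, the following computes a per-class Dice and its unweighted mean over classes. The function names `dice_score` and `mean_dice`, the nested-list label maps, and the convention of returning 1.0 when a class is absent from both maps are assumptions made for illustration, not details taken from the paper.

```python
def dice_score(pred, target, label):
    """Dice coefficient for one class label over two 2D label maps (lists of rows)."""
    # Flatten each map into a list of booleans marking pixels that carry `label`.
    pred_hits = [p == label for row in pred for p in row]
    target_hits = [t == label for row in target for t in row]
    if len(pred_hits) != len(target_hits):
        raise ValueError("prediction and target must have the same number of pixels")

    # Pixels where both maps agree on this label.
    intersection = sum(1 for p, t in zip(pred_hits, target_hits) if p and t)
    total = sum(pred_hits) + sum(target_hits)
    if total == 0:
        # Assumption: a class absent from both maps counts as a perfect match.
        return 1.0
    return 2.0 * intersection / total


def mean_dice(pred, target, labels):
    """Unweighted mean of the per-class Dice over the given class labels."""
    return sum(dice_score(pred, target, label) for label in labels) / len(labels)
```

For example, calling `mean_dice(pred, target, [0, 1, 2])` on two small label maps averages the three per-class scores with equal weight; a paper may instead weight classes or average over images first, which gives different numbers.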
Pages: 5