HCTA-Net: A Hybrid CNN-Transformer Attention Network for Surgical Instrument Segmentation

Cited by: 2
Authors
Yang, Lei [1]
Wang, Hongyong [1]
Bian, Guibin [1, 2]
Liu, Yanhong [1]
Affiliations
[1] Zhengzhou Univ, Sch Elect Engn, Zhengzhou 450001, Henan, Peoples R China
[2] Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
Image segmentation; Feature extraction; Instruments; Transformers; Task analysis; Surgery; Robots; Surgical instruments; Deep architecture; Medical robotics; surgical instrument segmentation; transformer; residual network; deep supervision; FEATURE AGGREGATION; IMAGES;
DOI
10.1109/TMRB.2023.3315479
Chinese Library Classification (CLC) number
R318 [Biomedical engineering];
Discipline classification code
0831;
Abstract
Surgical robots play an increasingly important role in surgery, and accurate surgical instrument segmentation is an important prerequisite for their stable operation. However, the task faces challenging factors such as scale variation and specular reflection. Recently, transformers have shown superior performance in image segmentation owing to their strong ability to capture long-range dependencies, but they do not capture locality and translation invariance well. In this paper, combining the advantages of transformers and CNNs, a hybrid CNN-Transformer attention network, named HCTA-Net, is proposed for automatic surgical instrument segmentation. To extract more comprehensive feature information from surgical images, a dual-path encoding unit is proposed for effective representation of both local detail features and global context. Meanwhile, an attention-based feature enhancement (AFE) module is proposed so that the features of the two encoding paths complement each other. In addition, to mitigate the limited processing capacity of simple connections, a multi-dimension attention (MDA) module is built to process the intermediate features along three directions (width, height, and space), filtering out interfering features while emphasizing the key regions of local feature maps. Further, an additive attention enhancement (AAE) module is introduced to further enhance local feature maps. Finally, to obtain more multi-scale global information, a multi-scale context fusion (MCF) module is placed at the bottleneck layer to provide different receptive fields and enrich the feature representation. Experimental results show that the proposed HCTA-Net achieves superior segmentation performance on surgical instruments compared with other state-of-the-art (SOTA) segmentation models.
Pages: 929-944
Number of pages: 16
Related papers
50 in total
  • [1] TFCNs: A CNN-Transformer Hybrid Network for Medical Image Segmentation
    Li, Zihan
    Li, Dihan
    Xu, Cangbai
    Wang, Weice
    Hong, Qingqi
    Li, Qingde
    Tian, Jie
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2022, PT IV, 2022, 13532 : 781 - 792
  • [2] TACT: Text attention based CNN-Transformer network for polyp segmentation
    Zhao, Yiyang
    Li, Jinjiang
    Hua, Zhen
    INTERNATIONAL JOURNAL OF IMAGING SYSTEMS AND TECHNOLOGY, 2024, 34 (02)
  • [3] HTC-Net: A hybrid CNN-transformer framework for medical image segmentation
    Tang, Hui
    Chen, Yuanbin
    Wang, Tao
    Zhou, Yuanbo
    Zhao, Longxuan
    Gao, Qinquan
    Du, Min
    Tan, Tao
    Zhang, Xinlin
    Tong, Tong
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 88
  • [4] HAU-Net: Hybrid CNN-transformer for breast ultrasound image segmentation
    Zhang, Huaikun
    Lian, Jing
    Yi, Zetong
    Wu, Ruichao
    Lu, Xiangyu
    Ma, Pei
    Ma, Yide
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 87
  • [5] HCTNet: A hybrid CNN-transformer network for breast ultrasound image segmentation
    He, Qiqi
    Yang, Qiuju
    Xie, Minghao
    COMPUTERS IN BIOLOGY AND MEDICINE, 2023, 155
  • [6] CSU-Net: A CNN-Transformer Parallel Network for Multimodal Brain Tumour Segmentation
    Chen, Yu
    Yin, Ming
    Li, Yu
    Cai, Qian
    ELECTRONICS, 2022, 11 (14)
  • [7] MFH-Net: A Hybrid CNN-Transformer Network Based Multi-Scale Fusion for Medical Image Segmentation
    Wang, Ying
    Zhang, Meng
    Liang, Jian'an
    Liang, Meiyan
    INTERNATIONAL JOURNAL OF IMAGING SYSTEMS AND TECHNOLOGY, 2024, 34 (06)
  • [8] HCT-net: hybrid CNN-transformer model based on a neural architecture search network for medical image segmentation
    Yu, Zhihong
    Lee, Feifei
    Chen, Qiu
    APPLIED INTELLIGENCE, 2023, 53 (17) : 19990 - 20006
  • [9] Semhybridnet: a semantically enhanced hybrid CNN-transformer network for radar pulse image segmentation
    Liu, Hongjia
    Xiao, Yubin
    Wu, Xuan
    Li, Yuanshu
    Zhao, Peng
    Liang, Yanchun
    Wang, Liupu
    Zhou, You
    COMPLEX & INTELLIGENT SYSTEMS, 2024, 10 (02) : 2851 - 2868
  • [10] CTHD-Net: CNN-Transformer hybrid dehazing network via residual global attention and gated boosting strategy
    Li, Haiyan
    Qiao, Renchao
    Yu, Pengfei
    Li, Haijiang
    Tan, Mingchuan
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, 99