HCTA-Net: A Hybrid CNN-Transformer Attention Network for Surgical Instrument Segmentation

Cited by: 2
Authors
Yang, Lei [1]
Wang, Hongyong [1]
Bian, Guibin [1, 2]
Liu, Yanhong [1]
Affiliations
[1] Zhengzhou Univ, Sch Elect Engn, Zhengzhou 450001, Henan, Peoples R China
[2] Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
Source
Funding
National Natural Science Foundation of China
Keywords
Image segmentation; Feature extraction; Instruments; Transformers; Task analysis; Surgery; Robots; Surgical instruments; Deep architecture; Medical robotics; surgical instrument segmentation; transformer; residual network; deep supervision; FEATURE AGGREGATION; IMAGES;
DOI
10.1109/TMRB.2023.3315479
Chinese Library Classification
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Surgical robots play an increasingly important role in surgery, and accurate surgical instrument segmentation is an important prerequisite for their stable operation. However, the task faces several challenges, such as scale changes and specular reflection. Recently, transformers have shown superior performance in image segmentation thanks to their strong ability to capture long-range dependencies, but they do not capture locality and translation invariance well. In this paper, combining the advantages of transformers and CNNs, a hybrid CNN-Transformer attention network, named HCTA-Net, is proposed for automatic surgical instrument segmentation. To extract more comprehensive feature information from surgical images, a dual-path encoding unit is proposed to represent both local detail features and global context. Meanwhile, an attention-based feature enhancement (AFE) module is proposed so that the features of the two encoding paths complement each other. In addition, to mitigate the limited processing capacity of simple connections, a multi-dimension attention (MDA) module is built to process the intermediate features along three directions (width, height, and space), filtering out interfering features while emphasizing the key regions of local feature maps. Further, an additive attention enhancement (AAE) module is introduced to further enhance local feature maps. Finally, to obtain more multi-scale global information, a multi-scale context fusion (MCF) module is proposed at the bottleneck layer, providing different receptive fields to enrich the feature representation. Experimental results show that the proposed HCTA-Net achieves superior segmentation performance on surgical instruments compared with other state-of-the-art (SOTA) segmentation models.
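The abstract describes the MDA module only at a high level: intermediate features are reweighted along width, height, and space. The sketch below is one possible reading of that idea and is not the module defined in the paper. The function name `directional_attention`, the sigmoid gates, and the averaging of the three gates are all assumptions made for illustration, and the input is simplified to a single-channel H x W map held in nested lists.

```python
import math


def sigmoid(x):
    """Logistic function, used here as a simple gate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def directional_attention(feature_map):
    """Reweight an H x W feature map along height, width, and position.

    Hypothetical illustration only: the gating scheme is an assumption,
    not the MDA module from the HCTA-Net paper.
    """
    height = len(feature_map)
    width = len(feature_map[0])

    # Height direction: one gate per row, computed from the row mean.
    row_gate = [sigmoid(sum(row) / width) for row in feature_map]

    # Width direction: one gate per column, computed from the column mean.
    col_gate = [
        sigmoid(sum(feature_map[i][j] for i in range(height)) / height)
        for j in range(width)
    ]

    # Spatial direction: one gate per position, computed from its own value.
    spatial_gate = [[sigmoid(v) for v in row] for row in feature_map]

    # Assumed fusion rule: average the three gates, then rescale the input.
    return [
        [
            feature_map[i][j]
            * (row_gate[i] + col_gate[j] + spatial_gate[i][j]) / 3.0
            for j in range(width)
        ]
        for i in range(height)
    ]


if __name__ == "__main__":
    example = [[4.0, 4.0], [0.1, 0.1]]
    for row in directional_attention(example):
        print(row)
```

In this sketch, rows, columns, and positions with larger activations keep a larger share of their value, which mirrors the stated goal of emphasizing key regions while suppressing interfering features. A real implementation would operate on multi-channel tensors with learned weights.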
Pages: 929-944
Page count: 16
Related papers
50 in total
  • [31] SaltFormer: A hybrid CNN-Transformer network for automatic salt dome detection
    Li, Yang
    Peng, Suping
    He, Dengke
    Computers and Geosciences, 2025, 195
  • [32] CNN-Transformer hybrid network for concrete dam crack patrol inspection
    Li, Mingchao
    Yuan, Jingyue
    Ren, Qiubing
    Luo, Qiling
    Fu, Junen
    Li, Zhitang
    AUTOMATION IN CONSTRUCTION, 2024, 163
  • [33] UCTNet: Uncertainty-guided CNN-Transformer hybrid networks for medical image segmentation
    Guo, Xiayu
    Lin, Xian
    Yang, Xin
    Yu, Li
    Cheng, Kwang-Ting
    Yan, Zengqiang
    PATTERN RECOGNITION, 2024, 152
  • [34] Polarformer: Optic Disc and Cup Segmentation Using a Hybrid CNN-Transformer and Polar Transformation
    Feng, Yaowei
    Li, Zhendong
    Yang, Dong
    Hu, Hongkai
    Guo, Hui
    Liu, Hao
    APPLIED SCIENCES-BASEL, 2023, 13 (01):
  • [35] CTFNet: CNN-Transformer Fusion Network for Remote-Sensing Image Semantic Segmentation
    Wu H.
    Huang P.
    Zhang M.
    Tang W.
    IEEE Geoscience and Remote Sensing Letters, 2024, 21 : 1 - 5
  • [36] CNN-TransNet: A Hybrid CNN-Transformer Network With Differential Feature Enhancement for Cloud Detection
    Ma, Nan
    Sun, Lin
    He, Yawen
    Zhou, Chenghu
    Dong, Chuanxiang
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2023, 20
  • [37] Breast Ultrasound Tumor Classification Using a Hybrid Multitask CNN-Transformer Network
    Shareef, Bryar
    Xian, Min
    Vakanski, Aleksandar
    Wang, Haotian
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT IV, 2023, 14223 : 344 - 353
  • [38] DACTransNet: A Hybrid CNN-Transformer Network for Histopathological Image Classification of Pancreatic Cancer
    Kou, Yongqing
    Xia, Cong
    Jiao, Yiping
    Zhang, Daoqiang
    Ge, Rongjun
    ARTIFICIAL INTELLIGENCE, CICAI 2023, PT II, 2024, 14474 : 422 - 434
  • [39] RoadCT: A Hybrid CNN-Transformer Network for Road Extraction From Satellite Imagery
    Liu, Wei
    Gao, Shufeng
    Zhang, Chun
    Yang, Bijia
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2024, 21 : 1 - 5
  • [40] CTIF-Net: A CNN-Transformer Iterative Fusion Network for Salient Object Detection
    Yuan, Junbin
    Zhu, Aiqing
    Xu, Qingzhen
    Wattanachote, Kanoksak
    Gong, Yongyi
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (05) : 3795 - 3805