Hybrid CNN-transformer network for interactive learning of challenging musculoskeletal images

Cited by: 0
Authors
Bi, Lei [1, 2]
Buehner, Ulrich [3]
Fu, Xiaohang
Williamson, Tom [3, 4]
Choong, Peter [5]
Kim, Jinman [2]
Affiliations
[1] Shanghai Jiao Tong Univ, Inst Translat Med, Natl Ctr Translat Med, Shanghai, Peoples R China
[2] Univ Sydney, Sch Comp Sci, Sydney, NSW, Australia
[3] Stryker Corp, Kalamazoo, MI USA
[4] RMIT Univ, Ctr Addit Mfg, Sch Engn, Melbourne, Vic, Australia
[5] Univ Melbourne, Dept Surg, Melbourne, Vic, Australia
Keywords
Musculoskeletal; Interactive segmentation; Transformer network; THORACOLUMBAR SPINE; CHEST RADIOGRAPHS; SEGMENTATION; METASTASES; CLASSIFICATION
DOI
10.1016/j.cmpb.2023.107875
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Background and objectives: Segmentation of regions of interest (ROIs) such as tumors and bones plays an essential role in the analysis of musculoskeletal (MSK) images. Segmentation results can help orthopaedic surgeons assess surgical outcomes and simulate a patient's gait cycle. Deep learning-based automatic segmentation methods, particularly those using fully convolutional networks (FCNs), are considered the state of the art. However, when the training data are insufficient to account for all the variations in ROIs, these methods struggle to segment challenging ROIs with less common image characteristics, such as low contrast to the background, inhomogeneous textures, and fuzzy boundaries. Methods: We propose a hybrid convolutional neural network-transformer network (HCTN) for semi-automatic segmentation to overcome the limitations of segmenting challenging MSK images. Specifically, we propose to fuse user inputs (manual, e.g., mouse clicks) with high-level semantic image features derived from the neural network (automatic), where the user inputs are used in interactive training for uncommon image characteristics. In addition, we propose to leverage the transformer network (TN), a deep learning model designed for handling sequence data, together with features derived from FCNs for segmentation; this addresses the limitation that FCNs operate only on small kernels, which tends to dismiss global context and focus only on local patterns. Results: We purposely selected three MSK imaging datasets covering a variety of structures to evaluate the generalizability of the proposed method. Our semi-automatic HCTN method achieved a Dice coefficient score (DSC) of 88.46 +/- 9.41 for segmenting soft-tissue sarcoma tumors from magnetic resonance (MR) images, 73.32 +/- 11.97 for segmenting osteosarcoma tumors from MR images, and 93.93 +/- 1.84 for segmenting clavicle bones from chest radiographs. Compared with the current state-of-the-art automatic segmentation method, our HCTN method is 11.7%, 19.11% and 7.36% higher in DSC on the three datasets, respectively. Conclusion: Our experimental results demonstrate that HCTN achieved more generalizable results than the current methods, especially with challenging MSK studies.
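For readers unfamiliar with the metric reported above, the Dice coefficient between a predicted mask A and a ground-truth mask B is conventionally defined as 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that standard definition in Python with NumPy. It is not the authors' evaluation code; the function name `dice_coefficient` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice coefficient of two binary masks, in the range [0, 1].

    Hypothetical helper for illustration; not taken from the paper.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    # Number of pixels labelled foreground in both masks.
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return float(2.0 * intersection / total)


# Toy 2x3 masks: 3 foreground pixels each, 2 of them shared.
pred_mask = [[1, 1, 0], [0, 1, 0]]
true_mask = [[1, 0, 0], [0, 1, 1]]
score = dice_coefficient(pred_mask, true_mask)
print(f"DSC = {100 * score:.2f}")
```

The abstract appears to report DSC on a 0 to 100 scale, which is why the toy example multiplies the score by 100 before printing.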
Pages: 10
Related papers
50 in total
  • [21] A hierarchical CNN-Transformer model for network intrusion detection
    Luo, Sijie
    Zhao, Zhiheng
    Hu, Qiyuan
    Liu, Yang
    [J]. 2ND INTERNATIONAL CONFERENCE ON APPLIED MATHEMATICS, MODELLING, AND INTELLIGENT COMPUTING (CAMMIC 2022), 2022, 12259
  • [22] A Semantic Perception and CNN-Transformer Hybrid Network for Occluded Person Re-Identification
    Gao, Zan
    Chen, Peng
    Zhuo, Tao
    Liu, Meng
    Zhu, Lei
    Wang, Meng
    Chen, Shengyong
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (04) : 2010 - 2025
  • [23] HCTA-Net: A Hybrid CNN-Transformer Attention Network for Surgical Instrument Segmentation
    Yang, Lei
    Wang, Hongyong
    Bian, Guibin
    Liu, Yanhong
    [J]. IEEE TRANSACTIONS ON MEDICAL ROBOTICS AND BIONICS, 2023, 5 (04) : 929 - 944
  • [24] View-independent gait events detection using CNN-transformer hybrid network
    Jamsrandorj, Ankhzaya
    Jung, Dawoon
    Kumar, Konki Sravan
    Arshad, Muhammad Zeeshan
    Lim, Hwasup
    Kim, Jinwook
    Mun, Kyung-Ryoul
    [J]. JOURNAL OF BIOMEDICAL INFORMATICS, 2023, 147
  • [25] Semhybridnet: a semantically enhanced hybrid CNN-transformer network for radar pulse image segmentation
    Hongjia Liu
    Yubin Xiao
    Xuan Wu
    Yuanshu Li
    Peng Zhao
    Yanchun Liang
    Liupu Wang
    You Zhou
    [J]. Complex & Intelligent Systems, 2024, 10 : 2851 - 2868
  • [26] RingMo-Lite: A Remote Sensing Lightweight Network With CNN-Transformer Hybrid Framework
    Wang, Yuelei
    Zhang, Ting
    Zhao, Liangjin
    Hu, Lin
    Wang, Zhechao
    Niu, Ziqing
    Cheng, Peirui
    Chen, Kaiqiang
    Zeng, Xuan
    Wang, Zhirui
    Wang, Hongqi
    Sun, Xian
    [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 20
  • [27] A joint learning framework for multisite CBCT-to-CT translation using a hybrid CNN-transformer synthesizer and a registration network
    Hu, Ying
    Cheng, Mengjie
    Wei, Hui
    Liang, Zhiwen
    [J]. FRONTIERS IN ONCOLOGY, 2024, 14
  • [28] Gaze-Swin: Enhancing Gaze Estimation with a Hybrid CNN-Transformer Network and Dropkey Mechanism
    Zhao, Ruijie
    Wang, Yuhuan
    Luo, Sihui
    Shou, Suyao
    Tang, Pinyan
    [J]. ELECTRONICS, 2024, 13 (02)
  • [29] A Two-stage hybrid CNN-Transformer Network for RGB Guided Indoor Depth Completion
    Deng, Yufan
    Deng, Xin
    Xu, Mai
    [J]. 2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 1127 - 1132
  • [30] Hybrid Time Distributed CNN-transformer for Speech Emotion Recognition
    Slimi, Anwer
    Nicolas, Henri
    Zrigui, Mounir
    [J]. PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON SOFTWARE TECHNOLOGIES (ICSOFT), 2022, : 602 - 611