Improving Surgical Scene Semantic Segmentation through a Deep Learning Architecture with Attention to Class Imbalance

Cited by: 5
Authors
Urrea, Claudio [1]
Garcia-Garcia, Yainet [1]
Kern, John [1]
Affiliations
[1] Univ Santiago Chile, Fac Engn, Elect Engn Dept, Las Sophoras 165, Santiago 9170020, Chile
Keywords
deep learning for laparoscopic surgery; class imbalance; activation function; loss function; Adam and SGDM optimizers; semantic segmentation of the surgical scene
DOI
10.3390/biomedicines12061309
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
This article addresses the semantic segmentation of laparoscopic surgery images, with special emphasis on structures that have a smaller number of observations. The study proposes adjustment parameters for deep neural network architectures that enable robust segmentation of all structures in the surgical scene. The U-Net architecture with five encoder-decoders (U-Net5ed), SegNet-VGG19, and DeepLabv3+ with different backbones are implemented. Three main experiments are conducted with the Rectified Linear Unit (ReLU), Gaussian Error Linear Unit (GELU), and Swish activation functions. The loss functions applied are Cross Entropy (CE), Focal Loss (FL), Tversky Loss (TL), Dice Loss (DiL), Cross Entropy Dice Loss (CEDL), and Cross Entropy Tversky Loss (CETL). The performance of the Stochastic Gradient Descent with momentum (SGDM) and Adaptive Moment Estimation (Adam) optimizers is compared. Qualitative and quantitative results confirm that the DeepLabv3+ and U-Net5ed architectures yield the best results. DeepLabv3+ with the ResNet-50 backbone, Swish activation function, and CETL loss function reports a Mean Accuracy (MAcc) of 0.976 and a Mean Intersection over Union (MIoU) of 0.977. The segmentation of structures with fewer observations, such as the hepatic vein, cystic duct, liver ligament, and blood, shows that the results are competitive with the consulted literature. The proposed parameters were also validated on the YOLOv9 architecture, where they improved semantic segmentation relative to the original architecture.
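The CETL loss named in the abstract combines cross entropy with a Tversky term. The abstract does not give the exact formulation, so the following is only a minimal pure-Python sketch under stated assumptions: the function names, the equal weighting (`weight=0.5`), and the values `alpha=0.3` and `beta=0.7` are illustrative choices, not the authors' settings.

```python
import math


def cross_entropy(probs, labels, eps=1e-7):
    """Mean pixel-wise cross entropy.

    probs:  list of per-pixel class-probability lists (each sums to 1)
    labels: list of integer class ids, one per pixel
    """
    total = sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels))
    return -total / len(labels)


def tversky_loss(probs, labels, num_classes, alpha=0.3, beta=0.7, eps=1e-7):
    """One minus the Tversky index, averaged over classes.

    alpha weights soft false positives and beta weights soft false negatives.
    The default values are assumptions made for illustration only.
    """
    index_sum = 0.0
    for c in range(num_classes):
        tp = sum(p[c] for p, y in zip(probs, labels) if y == c)
        fp = sum(p[c] for p, y in zip(probs, labels) if y != c)
        fn = sum(1.0 - p[c] for p, y in zip(probs, labels) if y == c)
        index_sum += (tp + eps) / (tp + alpha * fp + beta * fn + eps)
    return 1.0 - index_sum / num_classes


def ce_tversky_loss(probs, labels, num_classes, weight=0.5, alpha=0.3, beta=0.7):
    """Weighted sum of cross entropy and Tversky loss (assumed combination)."""
    ce = cross_entropy(probs, labels)
    tl = tversky_loss(probs, labels, num_classes, alpha, beta)
    return weight * ce + (1.0 - weight) * tl


if __name__ == "__main__":
    # Four pixels, two classes; class 1 is the rare structure.
    labels = [0, 0, 0, 1]
    probs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.4, 0.6]]
    print(ce_tversky_loss(probs, labels, num_classes=2))
```

With `alpha = beta = 0.5` the Tversky index reduces to the Dice coefficient, so the same sketch also covers a CEDL-style combination. Setting `beta` above `alpha` penalizes false negatives more heavily, which is one common way to favor classes with few pixels.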
Pages: 30