Improving Surgical Scene Semantic Segmentation through a Deep Learning Architecture with Attention to Class Imbalance

Cited by: 5

Authors:

Urrea, Claudio [1]

Garcia-Garcia, Yainet [1]

Kern, John [1]

Affiliations:

[1] Univ Santiago Chile, Fac Engn, Elect Engn Dept, Las Sophoras 165, Santiago 9170020, Chile

Source:

BIOMEDICINES | 2024 / Vol. 12 / Issue 06

Keywords:

deep learning for laparoscopic surgery; class imbalance; activation function; loss function; Adam and SGDM optimizers; semantic segmentation of the surgical scene;

DOI:

10.3390/biomedicines12061309

Chinese Library Classification (CLC):

Q5 [Biochemistry]; Q7 [Molecular Biology];

Discipline classification codes:

071010 ; 081704 ;

Abstract:

This article addresses the semantic segmentation of laparoscopic surgery images, with special emphasis on structures that have fewer observations. The study proposes adjustment parameters for deep neural network architectures that enable robust segmentation of all structures in the surgical scene. Three architectures are implemented: U-Net with five encoder-decoders (U-Net5ed), SegNet-VGG19, and DeepLabv3+ with different backbones. Three main experiments are conducted, using the Rectified Linear Unit (ReLU), Gaussian Error Linear Unit (GELU), and Swish activation functions. The loss functions applied are Cross Entropy (CE), Focal Loss (FL), Tversky Loss (TL), Dice Loss (DiL), Cross Entropy Dice Loss (CEDL), and Cross Entropy Tversky Loss (CETL). The Stochastic Gradient Descent with momentum (SGDM) and Adaptive Moment Estimation (Adam) optimizers are compared. Qualitative and quantitative results confirm that the DeepLabv3+ and U-Net5ed architectures perform best. DeepLabv3+ with the ResNet-50 backbone, the Swish activation function, and the CETL loss function reports a Mean Accuracy (MAcc) of 0.976 and a Mean Intersection over Union (MIoU) of 0.977. The segmentation of structures with fewer observations, such as the hepatic vein, cystic duct, liver ligament, and blood, shows that the results are competitive and promising compared with the consulted literature. The selected parameters were also validated on the YOLOv9 architecture, which showed improved semantic segmentation compared with the original architecture.
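As a rough illustration of the kind of combined loss the abstract names (Cross Entropy Tversky Loss), the sketch below adds a cross-entropy term to a soft Tversky term over per-pixel class probabilities. It is not taken from the paper: the function names, the mixing `weight`, and the `alpha`/`beta` values are assumptions chosen for illustration, and the paper's exact formulation may differ.

```python
import math

def cross_entropy(probs, labels, eps=1e-7):
    """Mean cross entropy; probs[i][c] is the predicted probability of class c at pixel i."""
    return -sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels)) / len(labels)

def tversky_loss(probs, labels, num_classes, alpha=0.5, beta=0.5, eps=1e-7):
    """One minus the soft Tversky index, averaged over classes.

    alpha weights false positives and beta weights false negatives
    (illustrative defaults; alpha = beta = 0.5 reduces to the Dice index).
    """
    total = 0.0
    for c in range(num_classes):
        tp = sum(p[c] for p, y in zip(probs, labels) if y == c)
        fp = sum(p[c] for p, y in zip(probs, labels) if y != c)
        fn = sum(1.0 - p[c] for p, y in zip(probs, labels) if y == c)
        total += (tp + eps) / (tp + alpha * fp + beta * fn + eps)
    return 1.0 - total / num_classes

def ce_tversky_loss(probs, labels, num_classes, weight=0.5, alpha=0.5, beta=0.5):
    """Weighted sum of the two terms; the 0.5 mixing weight is an assumption."""
    ce = cross_entropy(probs, labels)
    tl = tversky_loss(probs, labels, num_classes, alpha, beta)
    return weight * ce + (1.0 - weight) * tl

# Three pixels, two classes: a perfect prediction and an uncertain one.
labels = [0, 1, 1]
perfect = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
uncertain = [[0.4, 0.6], [0.6, 0.4], [0.5, 0.5]]
loss_perfect = ce_tversky_loss(perfect, labels, 2)
loss_uncertain = ce_tversky_loss(uncertain, labels, 2)
```

Because the Tversky term is computed per class and then averaged, a rare class contributes as much to the loss as a frequent one, which is one common motivation for pairing it with cross entropy when classes are imbalanced.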

Pages: 30
