Improving Surgical Scene Semantic Segmentation through a Deep Learning Architecture with Attention to Class Imbalance

Cited by: 5

Authors:

Urrea, Claudio [1]

Garcia-Garcia, Yainet [1]

Kern, John [1]

Affiliations:

[1] Univ Santiago Chile, Fac Engn, Elect Engn Dept, Las Sophoras 165, Santiago 9170020, Chile

Source:

BIOMEDICINES | 2024 / Vol. 12 / Issue 06

Keywords:

deep learning for laparoscopic surgery; class imbalance; activation function; loss function; Adam and SGDM optimizers; semantic segmentation of the surgical scene

DOI:

10.3390/biomedicines12061309

Chinese Library Classification (CLC) number:

Q5 [Biochemistry]; Q7 [Molecular Biology]

Discipline classification code:

071010; 081704

Abstract:

This article addresses the semantic segmentation of laparoscopic surgery images, placing special emphasis on the segmentation of structures with a smaller number of observations. As a result of this study, adjustment parameters are proposed for deep neural network architectures, enabling a robust segmentation of all structures in the surgical scene. The U-Net architecture with five encoder-decoders (U-Net5ed), SegNet-VGG19, and DeepLabv3+ employing different backbones are implemented. Three main experiments are conducted, working with Rectified Linear Unit (ReLU), Gaussian Error Linear Unit (GELU), and Swish activation functions. The applied loss functions include Cross Entropy (CE), Focal Loss (FL), Tversky Loss (TL), Dice Loss (DiL), Cross Entropy Dice Loss (CEDL), and Cross Entropy Tversky Loss (CETL). The performance of Stochastic Gradient Descent with momentum (SGDM) and Adaptive Moment Estimation (Adam) optimizers is compared. It is qualitatively and quantitatively confirmed that DeepLabv3+ and U-Net5ed architectures yield the best results. The DeepLabv3+ architecture with the ResNet-50 backbone, Swish activation function, and CETL loss function reports a Mean Accuracy (MAcc) of 0.976 and Mean Intersection over Union (MIoU) of 0.977. The semantic segmentation of structures with a smaller number of observations, such as the hepatic vein, cystic duct, Liver Ligament, and blood, verifies that the obtained results are very competitive and promising compared to the consulted literature. The proposed selected parameters were validated in the YOLOv9 architecture, which showed an improvement in semantic segmentation compared to the results obtained with the original architecture.
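As a rough illustration of two quantities named in the abstract, the NumPy sketch below combines cross entropy with a Tversky term (in the spirit of CETL) and computes a mean IoU. It is not the authors' implementation. The equal weighting (`weight=0.5`), the `alpha`/`beta` values, the `eps` smoothing, and all function names are assumptions chosen for this example; the record does not state the formulation used in the paper.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last (class) axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, onehot, eps=1e-7):
    # Mean over pixels of -sum_c y_c * log(p_c).
    return float(-(onehot * np.log(probs + eps)).sum(axis=-1).mean())

def tversky_loss(probs, onehot, alpha=0.3, beta=0.7, eps=1e-7):
    # Soft per-class counts, summed over all pixels.
    tp = (probs * onehot).sum(axis=0)
    fp = (probs * (1.0 - onehot)).sum(axis=0)
    fn = ((1.0 - probs) * onehot).sum(axis=0)
    # beta > alpha penalizes false negatives more, which tends to help
    # rare classes; alpha = beta = 0.5 reduces to the Dice index.
    index = (tp + eps) / (tp + alpha * fp + beta * fn + eps)
    return float(1.0 - index.mean())

def ce_tversky_loss(logits, labels, num_classes, weight=0.5,
                    alpha=0.3, beta=0.7):
    # Assumed combination: weighted sum of the two terms.
    probs = softmax(logits.reshape(-1, num_classes))
    onehot = np.eye(num_classes)[labels.reshape(-1)]
    return (weight * cross_entropy(probs, onehot)
            + (1.0 - weight) * tversky_loss(probs, onehot, alpha, beta))

def mean_iou(pred, labels, num_classes):
    # Average IoU over the classes that occur in pred or labels.
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, labels == c).sum()
        union = np.logical_or(pred == c, labels == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))

if __name__ == "__main__":
    labels = np.array([[0, 1], [1, 1]])          # toy 2x2 label map
    logits = np.random.default_rng(0).normal(size=(2, 2, 2))
    print("CE + Tversky:", ce_tversky_loss(logits, labels, num_classes=2))
    print("MIoU:", mean_iou(logits.argmax(axis=-1), labels, num_classes=2))
```

Setting `beta` above `alpha` is one common way to bias a Tversky term toward recall on under-represented classes; whether the paper uses these values is not stated in this record.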

Citations

Pages: 30

50 items in total

[31] Deep Learning Approach for Multi-class Semantic Segmentation of UAV Images
Chouhan, Avinash
Chutia, Dibyajyoti
Aggarwal, Shiv Prasad
INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2023, 32 (07)
[32] Attention Mechanism for Improving Facial Landmark Semantic Segmentation
Kim, Hyungjoon
Kim, Hyeonwoo
Cho, Seongkuk
Hwang, Eenjun
ADVANCES IN ARTIFICIAL INTELLIGENCE AND APPLIED COGNITIVE COMPUTING, 2021, : 817 - 824
[33] Deep semantic learning for acoustic scene classification
Yun-Fei Shao
Xin-Xin Ma
Yong Ma
Wei-Qiang Zhang
EURASIP Journal on Audio, Speech, and Music Processing, 2024
[34] Deep semantic learning for acoustic scene classification
Shao, Yun-Fei
Ma, Xin-Xin
Ma, Yong
Zhang, Wei-Qiang
EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2024, 2024 (01)
[35] Scene Adaptation for Semantic Segmentation using Adversarial Learning
Di Mauro, D.
Furnari, A.
Patane, G.
Battiato, S.
Farinella, G. M.
2018 15TH IEEE INTERNATIONAL CONFERENCE ON ADVANCED VIDEO AND SIGNAL BASED SURVEILLANCE (AVSS), 2018, : 97 - 102
[36] RoseSegNet: An attention-based deep learning architecture for organ segmentation of plants
Turgut, Kaya
Dutagaci, Helin
Rousseau, David
BIOSYSTEMS ENGINEERING, 2022, 221 : 138 - 153
[37] Dual attention-based deep learning network for multi-class object semantic segmentation of tunnel point clouds
Ji, Ankang
Zhang, Limao
Fan, Hongqin
Xue, Xiaolong
Dou, Yudan
AUTOMATION IN CONSTRUCTION, 2023, 156
[38] Improving Surgical Models through One/Two Class Learning
Chia, Chih-Chun
Karam, Zahi
Lee, Gyemin
Rubinfeld, Ilan
Syed, Zeeshan
2012 ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2012, : 5098 - 5101
[39] Deep semantic segmentation for visual scene understanding of soil types
Zamani, Vahid
Taghaddos, Hosein
Gholipour, Yaghob
Pourreza, Hamidreza
AUTOMATION IN CONSTRUCTION, 2022, 140
[40] Road Scene Segmentation Based on Deep Learning
Zheng, Ke
Naji, Hasan Abdullah Hasan
IEEE Access, 2020, 8 : 140964 - 140971
