Improving Surgical Scene Semantic Segmentation through a Deep Learning Architecture with Attention to Class Imbalance

Cited by: 5
Authors
Urrea, Claudio [1]
Garcia-Garcia, Yainet [1]
Kern, John [1]
Affiliation
[1] Univ Santiago Chile, Fac Engn, Elect Engn Dept, Las Sophoras 165, Santiago 9170020, Chile
Keywords
deep learning for laparoscopic surgery; class imbalance; activation function; loss function; Adam and SGDM optimizers; semantic segmentation of the surgical scene
DOI
10.3390/biomedicines12061309
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
This article addresses the semantic segmentation of laparoscopic surgery images, with particular emphasis on structures that have few observations. The study proposes tuning parameters for deep neural network architectures that enable robust segmentation of all structures in the surgical scene. Three architectures are implemented: U-Net with five encoder-decoders (U-Net5ed), SegNet-VGG19, and DeepLabv3+ with different backbones. Three main experiments are conducted using the Rectified Linear Unit (ReLU), Gaussian Error Linear Unit (GELU), and Swish activation functions. The loss functions applied are Cross Entropy (CE), Focal Loss (FL), Tversky Loss (TL), Dice Loss (DiL), Cross Entropy Dice Loss (CEDL), and Cross Entropy Tversky Loss (CETL). The Stochastic Gradient Descent with momentum (SGDM) and Adaptive Moment Estimation (Adam) optimizers are compared. Qualitative and quantitative evaluation confirms that the DeepLabv3+ and U-Net5ed architectures yield the best results. DeepLabv3+ with the ResNet-50 backbone, the Swish activation function, and the CETL loss function reports a Mean Accuracy (MAcc) of 0.976 and a Mean Intersection over Union (MIoU) of 0.977. The segmentation results for structures with few observations, such as the hepatic vein, cystic duct, liver ligament, and blood, are competitive with those reported in the consulted literature. The selected parameters were also validated on the YOLOv9 architecture, where they improved semantic segmentation relative to the original architecture.
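The abstract names a combined Cross Entropy Tversky Loss (CETL) but does not give its formula. The following is a minimal sketch of one common way to combine the two terms, assuming an unweighted sum of pixel-wise cross entropy and a class-averaged soft Tversky loss. The function names, the `alpha`/`beta` values, and the unweighted sum are illustrative assumptions, not the authors' implementation.

```python
import math

def cross_entropy(probs, labels, eps=1e-7):
    """Mean negative log-probability of the true class over all pixels."""
    return -sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels)) / len(labels)

def tversky_loss(probs, labels, num_classes, alpha=0.5, beta=0.5, eps=1e-7):
    """One minus the soft Tversky index, averaged over classes.

    alpha weights false positives and beta weights false negatives;
    alpha = beta = 0.5 reduces the index to the Dice coefficient.
    """
    total = 0.0
    for c in range(num_classes):
        tp = sum(p[c] for p, y in zip(probs, labels) if y == c)        # soft true positives
        fp = sum(p[c] for p, y in zip(probs, labels) if y != c)        # soft false positives
        fn = sum(1.0 - p[c] for p, y in zip(probs, labels) if y == c)  # soft false negatives
        total += (tp + eps) / (tp + alpha * fp + beta * fn + eps)
    return 1.0 - total / num_classes

def ce_tversky_loss(probs, labels, num_classes, alpha=0.5, beta=0.5):
    """Assumed combination: unweighted sum of the two terms."""
    return (cross_entropy(probs, labels)
            + tversky_loss(probs, labels, num_classes, alpha, beta))

# Hypothetical example: three pixels, two classes, per-pixel class probabilities.
probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
labels = [0, 1, 1]
loss = ce_tversky_loss(probs, labels, num_classes=2, alpha=0.3, beta=0.7)
```

Setting `beta` above `alpha` penalizes false negatives more heavily, which is one reason Tversky-style losses are used when some classes have few pixels. Because the Tversky term is averaged per class rather than per pixel, rare classes contribute as much to it as frequent ones.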
Pages: 30