A spatio-temporal network for video semantic segmentation in surgical videos

Cited by: 4

Authors:

Grammatikopoulou, Maria [1]

Sanchez-Matilla, Ricardo [1]

Bragman, Felix [1]

Owen, David [1]

Culshaw, Lucy [1]

Kerr, Karen [1]

Stoyanov, Danail [1,2]

Luengo, Imanol [1]

Affiliations:

[1] Medtronic Plc, London, England

[2] UCL, Wellcome EPSRC Ctr Intervent & Surg Sci, London, England

Source:

INTERNATIONAL JOURNAL OF COMPUTER ASSISTED RADIOLOGY AND SURGERY | 2023 / Vol. 19 / No. 2

Keywords:

Video segmentation; Semantic segmentation;

DOI:

10.1007/s11548-023-02971-6

Chinese Library Classification (CLC) number:

R318 [Biomedical Engineering];

Discipline classification code:

0831;

Abstract:

Purpose: Semantic segmentation in surgical videos has applications in intra-operative guidance, post-operative analytics and surgical education. Models need to provide accurate predictions since temporally inconsistent identification of anatomy can hinder patient safety. We propose a novel architecture for modelling temporal relationships in videos to address these issues.

Methods: We developed a temporal segmentation model that includes a static encoder and a spatio-temporal decoder. The encoder processes individual frames whilst the decoder learns spatio-temporal relationships from frame sequences. The decoder can be used with any suitable encoder to improve temporal consistency.

Results: Model performance was evaluated on the CholecSeg8k dataset and a private dataset of robotic Partial Nephrectomy procedures. Mean Intersection over Union improved by 1.30% and 4.27% respectively for each dataset when the temporal decoder was applied. Our model also displayed improvements in temporal consistency up to 7.23%.

Conclusions: This work demonstrates an advance in video segmentation of surgical scenes with potential applications in surgery with a view to improve patient outcomes. The proposed decoder can extend state-of-the-art static models, and it is shown that it can improve per-frame segmentation output and video temporal consistency.
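The abstract reports results as mean Intersection over Union (mIoU). As an illustration only, the sketch below shows one common way to compute mIoU for a single frame. It assumes flattened per-pixel class labels and averages over the classes that appear in either mask. The function name `mean_iou` and this averaging convention are assumptions made for the example; the abstract does not specify the authors' exact evaluation protocol.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over the classes present in the prediction or the ground truth.

    `pred` and `target` are flattened per-pixel class labels in [0, num_classes).
    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must contain the same number of pixels")

    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both masks
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")


# Toy 4-pixel frame with two classes: IoU is 1/2 for class 0 and 2/3 for class 1.
print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))  # approx. 0.583
```

For a video, a per-frame score like this would typically be averaged across frames or accumulated over the whole dataset before averaging over classes; which of the two is used changes the number, so it should be stated alongside any reported value.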

Pages: 375-382

Number of pages: 8
