A spatio-temporal network for video semantic segmentation in surgical videos

Cited by: 4

Authors:

Grammatikopoulou, Maria [1]

Sanchez-Matilla, Ricardo [1]

Bragman, Felix [1]

Owen, David [1]

Culshaw, Lucy [1]

Kerr, Karen [1]

Stoyanov, Danail [1,2]

Luengo, Imanol [1]

Affiliations:

[1] Medtronic Plc, London, England

[2] UCL, Wellcome EPSRC Ctr Intervent & Surg Sci, London, England

Source:

INTERNATIONAL JOURNAL OF COMPUTER ASSISTED RADIOLOGY AND SURGERY | 2023 / Vol. 19 / Issue 2

Keywords:

Video segmentation; Semantic segmentation;

DOI:

10.1007/s11548-023-02971-6

Chinese Library Classification:

R318 [Biomedical Engineering];

Discipline classification code:

0831 ;

Abstract:

Purpose: Semantic segmentation in surgical videos has applications in intra-operative guidance, post-operative analytics and surgical education. Models need to provide accurate predictions, since temporally inconsistent identification of anatomy can hinder patient safety. We propose a novel architecture for modelling temporal relationships in videos to address these issues.

Methods: We developed a temporal segmentation model that includes a static encoder and a spatio-temporal decoder. The encoder processes individual frames, whilst the decoder learns spatio-temporal relationships from frame sequences. The decoder can be used with any suitable encoder to improve temporal consistency.

Results: Model performance was evaluated on the CholecSeg8k dataset and a private dataset of robotic Partial Nephrectomy procedures. When the temporal decoder was applied, Mean Intersection over Union improved by 1.30% and 4.27% on the two datasets, respectively. Our model also displayed improvements in temporal consistency of up to 7.23%.

Conclusions: This work demonstrates an advance in video segmentation of surgical scenes, with potential applications in surgery with a view to improving patient outcomes. The proposed decoder can extend state-of-the-art static models, and it is shown to improve both per-frame segmentation output and video temporal consistency.
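As an illustration of the Mean Intersection over Union metric reported in the abstract, the following is a minimal sketch for a single pair of label maps. It is not the authors' evaluation code: the function name `mean_iou`, the toy label maps, and the choice to skip classes that appear in neither map are assumptions made here, and the abstract does not state the paper's exact averaging protocol.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    num_classes: int,
) -> float:
    """Mean IoU over the classes that occur in the prediction or the target.

    Assumption: classes absent from both maps are ignored rather than
    counted as IoU 0 or 1. Other conventions exist.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel agrees: it belongs to both intersection and union.
                intersection[p] += 1
                union[p] += 1
            else:
                # Pixel disagrees: it enlarges the union of both classes.
                union[p] += 1
                union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical 2x3 label maps with three classes (0, 1, 2).
pred = [[0, 0, 1], [1, 1, 2]]
target = [[0, 1, 1], [1, 1, 2]]
score = mean_iou(pred, target, num_classes=3)
print(score)
```

In this toy case the per-class IoUs are 1/2, 3/4 and 1, so the mean is 0.75. For a video, such a score would typically be aggregated over frames, but how the paper aggregates is not specified in this record.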


Pages: 375-382

Page count: 8

50 records in total

[11] Spatio-temporal Semantic Segmentation for Drone Detection
Craye, Celine
Ardjoune, Salem
2019 16TH IEEE INTERNATIONAL CONFERENCE ON ADVANCED VIDEO AND SIGNAL BASED SURVEILLANCE (AVSS), 2019,
[12] Video segmentation using spatio-temporal information
Kim, YW
Ho, YS
IEEE TENCON'97 - IEEE REGIONAL 10 ANNUAL CONFERENCE, PROCEEDINGS, VOLS 1 AND 2: SPEECH AND IMAGE TECHNOLOGIES FOR COMPUTING AND TELECOMMUNICATIONS, 1997, : 785 - 788
[13] Video Segmentation by Spatio-temporal Random Walk
Chang, Jing
Wang, Hui
PROCEEDINGS OF THE 2018 INTERNATIONAL CONFERENCE ON E-BUSINESS, INFORMATION MANAGEMENT AND COMPUTER SCIENCE, 2018, : 54 - 58
[14] Video region segmentation by spatio-temporal watersheds
El Saban, MA
Manjunath, BS
2003 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOL 1, PROCEEDINGS, 2003, : 349 - 352
[15] Automatic spatio-temporal video sequence segmentation
Vass, J
Palaniappan, K
Zhuang, XH
1998 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING - PROCEEDINGS, VOL 1, 1998, : 958 - 962
[16] Surgical tool segmentation and localization using spatio-temporal deep network
Kanakatte, Aparna
Ramaswamy, Akshaya
Gubbi, Jayavardhana
Ghose, Avik
Purushothaman, Balamuralidhar
42ND ANNUAL INTERNATIONAL CONFERENCES OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY: ENABLING INNOVATIVE TECHNOLOGIES FOR GLOBAL HEALTHCARE EMBC'20, 2020, : 1658 - 1661
[17] Spatio-Temporal Graph-based Semantic Compositional Network for Video Captioning
Li, Shun
Zhang, Ze-Fan
Ji, Yi
Li, Ying
Liu, Chun-Ping
2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
[18] Semantic object segmentation by a spatio-temporal MRF model
Zeng, W
Gao, W
PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 4, 2004, : 775 - 778
[19] Spatio-temporal Predictive Network For Videos With Physical Properties
Aoyagi, Yuka
Murata, Noboru
Sakaino, Hidetomo
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 2268 - 2278
[20] Spatio-temporal joint probability images for video segmentation
Li, ZN
Wei, J
2000 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOL II, PROCEEDINGS, 2000, : 295 - 298
