A spatio-temporal network for video semantic segmentation in surgical videos

Cited by: 4
Authors
Grammatikopoulou, Maria [1]
Sanchez-Matilla, Ricardo [1]
Bragman, Felix [1]
Owen, David [1]
Culshaw, Lucy [1]
Kerr, Karen [1]
Stoyanov, Danail [1,2]
Luengo, Imanol [1]
Affiliations
[1] Medtronic Plc, London, England
[2] UCL, Wellcome EPSRC Ctr Intervent & Surg Sci, London, England
Keywords
Video segmentation; Semantic segmentation;
DOI
10.1007/s11548-023-02971-6
Chinese Library Classification number
R318 [Biomedical engineering]
Discipline classification code
0831
Abstract
Purpose: Semantic segmentation in surgical videos has applications in intra-operative guidance, post-operative analytics and surgical education. Models need to provide accurate predictions since temporally inconsistent identification of anatomy can hinder patient safety. We propose a novel architecture for modelling temporal relationships in videos to address these issues.
Methods: We developed a temporal segmentation model that includes a static encoder and a spatio-temporal decoder. The encoder processes individual frames whilst the decoder learns spatio-temporal relationships from frame sequences. The decoder can be used with any suitable encoder to improve temporal consistency.
Results: Model performance was evaluated on the CholecSeg8k dataset and a private dataset of robotic Partial Nephrectomy procedures. Mean Intersection over Union improved by 1.30% and 4.27% respectively for each dataset when the temporal decoder was applied. Our model also displayed improvements in temporal consistency up to 7.23%.
Conclusions: This work demonstrates an advance in video segmentation of surgical scenes with potential applications in surgery with a view to improve patient outcomes. The proposed decoder can extend state-of-the-art static models, and it is shown that it can improve per-frame segmentation output and video temporal consistency.
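For readers unfamiliar with the headline metric, the following is a minimal sketch of how Mean Intersection over Union is commonly computed for a single frame. It is not the authors' evaluation code: the function name `mean_iou`, the flat label lists, and the choice to skip classes absent from both prediction and ground truth are all assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in the prediction or the ground truth.

    pred, target: flat sequences of integer class labels, one per pixel
    (hypothetical input layout chosen for this sketch).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")

    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both maps (intersection) or in either map (union).
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union == 0:
            continue  # assumption: a class absent from both maps is ignored
        ious.append(inter / union)

    return sum(ious) / len(ious) if ious else float("nan")


# Toy 2x2 frame, flattened row by row, with two classes.
pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
miou = mean_iou(pred, target, num_classes=2)
```

In this toy frame, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. Dataset-level scores are usually obtained by aggregating over all frames, and the exact aggregation used in the paper is not stated in this record.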
Pages: 375-382
Number of pages: 8