Exploring Segment-Level Semantics for Online Phase Recognition From Surgical Videos

Cited by: 22
Authors
Ding, Xinpeng [1]
Li, Xiaomeng [1, 2]
Affiliations
[1] Hong Kong Univ Sci & Technol, Dept Elect & Comp Engn, Hong Kong, Peoples R China
[2] Hong Kong Univ Sci & Technol, Shenzhen Res Inst, Shenzhen 518057, Peoples R China
Keywords
Surgery; Videos; Feature extraction; Semantics; Hidden Markov models; Task analysis; Convolution; Surgical video analysis; surgical phase recognition; REAL-TIME SEGMENTATION; WORKFLOW RECOGNITION; TASKS;
DOI
10.1109/TMI.2022.3182995
Chinese Library Classification
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Automatic surgical phase recognition plays a vital role in robot-assisted surgeries. Existing methods overlook a pivotal point: surgical phases should be classified by learning segment-level semantics rather than relying solely on frame-wise information. This paper presents a segment-attentive hierarchical consistency network (SAHC) for surgical phase recognition from videos. The key idea is to extract hierarchical, high-level, semantically consistent segments and use them to refine the erroneous predictions caused by ambiguous frames. To achieve this, we design a temporal hierarchical network to generate hierarchical high-level segments. We then introduce a hierarchical segment-frame attention module to capture relations between the low-level frames and the high-level segments. By regularizing the predictions of frames and their corresponding segments via a consistency loss, the network can generate semantically consistent segments and then rectify the misclassified predictions caused by ambiguous low-level frames. We validate SAHC on two public surgical video datasets: the M2CAI16 challenge dataset and the Cholec80 dataset. Experimental results show that our method outperforms previous state-of-the-art methods, and ablation studies demonstrate the effectiveness of the proposed modules. Our code has been released at: https://github.com/xmed-lab/SAHC.
Pages: 3309-3319
Number of pages: 11
Related papers
50 in total
  • [31] Exploring the role of semantics in bilingual word recognition: Evidence from interlingual homophones
    Friesen, Deanna
    Jared, Debra
    CANADIAN JOURNAL OF EXPERIMENTAL PSYCHOLOGY-REVUE CANADIENNE DE PSYCHOLOGIE EXPERIMENTALE, 2008, 62 (04): : 270 - 270
  • [32] PointNetPGAP-SLC: A 3D LiDAR-Based Place Recognition Approach With Segment-Level Consistency Training for Mobile Robots in Horticulture
    Barros, T.
    Garrote, L.
    Conde, P.
    Coombes, M. J.
    Liu, C.
    Premebida, C.
    Nunes, U. J.
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (11): : 10471 - 10478
  • [33] TRandAugment: temporal random augmentation strategy for surgical activity recognition from videos
    Sanat Ramesh
    Diego Dall’Alba
    Cristians Gonzalez
    Tong Yu
    Pietro Mascagni
    Didier Mutter
    Jacques Marescaux
    Paolo Fiorini
    Nicolas Padoy
    International Journal of Computer Assisted Radiology and Surgery, 2023, 18 : 1665 - 1672
  • [34] TRandAugment: temporal random augmentation strategy for surgical activity recognition from videos
    Ramesh, Sanat
    Dall'Alba, Diego
    Gonzalez, Cristians
    Yu, Tong
    Mascagni, Pietro
    Mutter, Didier
    Marescaux, Jacques
    Fiorini, Paolo
    Padoy, Nicolas
    INTERNATIONAL JOURNAL OF COMPUTER ASSISTED RADIOLOGY AND SURGERY, 2023, 18 (09) : 1665 - 1672
  • [35] A Multimodal Transformer Model for Recognition of Images from Complex Laparoscopic Surgical Videos
    Abiyev, Rahib H.
    Altabel, Mohamad Ziad
    Darwish, Manal
    Helwan, Abdulkader
    DIAGNOSTICS, 2024, 14 (07)
  • [36] Experience Level Influences Users' Interactions With and Expectations For Online Surgical Videos: A Mixed-Methods Study
    London, Daniel A.
    Zastrow, Ryley K.
    Gluck, Matthew J.
    Cagle, Paul J.
    JOURNAL OF HAND SURGERY-AMERICAN VOLUME, 2021, 46 (07): : 560 - 574
  • [37] PATG: position-aware temporal graph networks for surgical phase recognition on laparoscopic videos
    Kadkhodamohammadi, Abdolrahim
    Luengo, Imanol
    Stoyanov, Danail
    INTERNATIONAL JOURNAL OF COMPUTER ASSISTED RADIOLOGY AND SURGERY, 2022, 17 (05) : 849 - 856
  • [38] PATG: position-aware temporal graph networks for surgical phase recognition on laparoscopic videos
    Abdolrahim Kadkhodamohammadi
    Imanol Luengo
    Danail Stoyanov
    International Journal of Computer Assisted Radiology and Surgery, 2022, 17 : 849 - 856
  • [39] A new segment-level mixing strategy to improve the gas separation performances of carbon molecular sieve membrane derived from polymer blends
    Chen, Zi-An
    Zhao, Bingyu
    Xin, Junhao
    Liu, Yaodong
    JOURNAL OF MEMBRANE SCIENCE, 2024, 695
  • [40] SKiT: a Fast Key Information Video Transformer for Online Surgical Phase Recognition
    Liu, Yang
    Huo, Jiayu
    Peng, Jingjing
    Sparks, Rachel
    Dasgupta, Prokar
    Granados, Alejandro
    Ourselin, Sebastien
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 21017 - 21027