Exploring Segment-Level Semantics for Online Phase Recognition From Surgical Videos

Cited by: 22
Authors
Ding, Xinpeng [1]
Li, Xiaomeng [1, 2]
Affiliations
[1] Hong Kong Univ Sci & Technol, Dept Elect & Comp Engn, Hong Kong, Peoples R China
[2] Hong Kong Univ Sci & Technol, Shenzhen Res Inst, Shenzhen 518057, Peoples R China
Keywords
Surgery; Videos; Feature extraction; Semantics; Hidden Markov models; Task analysis; Convolution; Surgical video analysis; surgical phase recognition; REAL-TIME SEGMENTATION; WORKFLOW RECOGNITION; TASKS
DOI
10.1109/TMI.2022.3182995
Chinese Library Classification
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Automatic surgical phase recognition plays a vital role in robot-assisted surgeries. Existing methods overlook a pivotal problem: surgical phases should be classified by learning segment-level semantics rather than by relying solely on frame-wise information. This paper presents a segment-attentive hierarchical consistency network (SAHC) for surgical phase recognition from videos. The key idea is to extract hierarchical high-level semantic-consistent segments and use them to refine the erroneous predictions caused by ambiguous frames. To achieve this, we design a temporal hierarchical network to generate hierarchical high-level segments. Then, we introduce a hierarchical segment-frame attention module to capture relations between the low-level frames and high-level segments. By regularizing the predictions of frames and their corresponding segments via a consistency loss, the network can generate semantic-consistent segments and then rectify the misclassified predictions caused by ambiguous low-level frames. We validate SAHC on two public surgical video datasets, i.e., the M2CAI16 challenge dataset and the Cholec80 dataset. Experimental results show that our method outperforms previous state-of-the-art methods, and ablation studies demonstrate the effectiveness of the proposed modules. Our code has been released at: https://github.com/xmed-lab/SAHC.
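The idea of regularizing frame predictions against their segment's prediction can be illustrated with a toy sketch. This is not the SAHC implementation (see the released repository for that): the fixed-length averaging windows, the squared-error penalty, and the function names `segment_average`, `consistency_loss`, and `refine` are all assumptions made for illustration, and the averaging here looks at every frame in a window, so it is an offline simplification.

```python
# Toy illustration only: fixed windows and a squared-error penalty are
# assumptions, not the segment extraction or loss used by SAHC.

def segment_average(frame_probs, seg_len):
    """Average per-frame class probabilities over consecutive windows."""
    segments = []
    for start in range(0, len(frame_probs), seg_len):
        window = frame_probs[start:start + seg_len]
        n_classes = len(window[0])
        segments.append(
            [sum(f[c] for f in window) / len(window) for c in range(n_classes)]
        )
    return segments


def consistency_loss(frame_probs, seg_len):
    """Mean squared gap between each frame and its segment's prediction."""
    segments = segment_average(frame_probs, seg_len)
    total, count = 0.0, 0
    for i, frame in enumerate(frame_probs):
        seg = segments[i // seg_len]
        for p_frame, p_seg in zip(frame, seg):
            total += (p_frame - p_seg) ** 2
            count += 1
    return total / count


def refine(frame_probs, seg_len):
    """Relabel each frame with the most likely class of its segment."""
    segments = segment_average(frame_probs, seg_len)
    labels = []
    for i in range(len(frame_probs)):
        seg = segments[i // seg_len]
        labels.append(max(range(len(seg)), key=seg.__getitem__))
    return labels


# Four frames, two phases; the third frame is ambiguous and, taken alone,
# would be assigned phase 1.
probs = [[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.9, 0.1]]
frame_labels = [max(range(2), key=p.__getitem__) for p in probs]
segment_labels = refine(probs, seg_len=4)
loss = consistency_loss(probs, seg_len=4)
```

In this sketch the frame-wise labels disagree on the ambiguous third frame, while the segment-level labels assign all four frames to the same phase; the loss is zero only when every frame already agrees with its segment.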
Pages: 3309-3319
Page count: 11