LRTD: long-range temporal dependency based active learning for surgical workflow recognition

Cited by: 19
|
Authors
Shi, Xueying [1 ]
Jin, Yueming [1 ]
Dou, Qi [1 ]
Heng, Pheng-Ann [1 ]
Affiliation
[1] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Surgical workflow recognition; Active learning; Long-range temporal dependency; Intra-clip dependency; SEGMENTATION; TASKS;
DOI
10.1007/s11548-020-02198-9
CLC number
R318 [Biomedical Engineering];
Discipline code
0831;
Abstract
Purpose Automatic surgical workflow recognition from video is a fundamental yet challenging problem for developing computer-assisted and robot-assisted surgery. Existing deep learning approaches have achieved remarkable performance on surgical video analysis but rely heavily on large-scale labelled datasets. Unfortunately, such annotations are rarely available in abundance, because producing them requires the domain knowledge of surgeons; even for experts, annotating a sufficient amount of data is tedious and time-consuming. Methods In this paper, we propose a novel active learning method for cost-effective surgical video analysis. Specifically, we propose a non-local recurrent convolutional network, which introduces a non-local block to capture the long-range temporal dependency (LRTD) among continuous frames. We then formulate an intra-clip dependency score to represent the overall dependency within each clip. By ranking scores among clips in the unlabelled data pool, we select the clips with weak dependencies to annotate, as these are the most informative ones for network training. Results We validate our approach on the surgical workflow recognition task using a large surgical video dataset (Cholec80). With our LRTD-based selection strategy, we outperform other state-of-the-art active learning methods that only consider neighbor-frame information. Using only up to 50% of the samples, our approach exceeds the performance of full-data training. Conclusion By modeling intra-clip dependency, our LRTD-based strategy selects informative video clips for annotation more effectively than other active learning methods, as evaluated on a popular public surgical dataset. The results also show the promising potential of our framework for reducing annotation workload in clinical practice.
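The abstract's selection strategy (score each unlabelled clip by how strongly its frames depend on one another, then annotate the weakest-dependency clips) can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the function names are hypothetical, the embedded-Gaussian (softmax dot-product) affinity is one common non-local block variant, and summarizing the attention map by its mean off-diagonal weight is an assumed stand-in for the paper's intra-clip dependency score.

```python
import numpy as np

def nonlocal_attention(features):
    # features: (T, C) per-frame embeddings of one clip.
    # Pairwise dot-product affinity, softmax-normalized per frame
    # (embedded-Gaussian form of a non-local block, as an assumption).
    sim = features @ features.T                      # (T, T)
    sim = np.exp(sim - sim.max(axis=1, keepdims=True))
    return sim / sim.sum(axis=1, keepdims=True)      # rows sum to 1

def intra_clip_dependency(features):
    # One plausible scalar summary: the average attention a frame pays
    # to *other* frames in the clip (mean off-diagonal weight).
    att = nonlocal_attention(features)
    T = att.shape[0]
    off_diag = att.sum() - np.trace(att)
    return off_diag / (T * (T - 1))

def select_clips_for_annotation(clips, budget):
    # Rank unlabelled clips by ascending dependency score and pick the
    # weakest-dependency (most informative) ones, as the abstract describes.
    scores = [intra_clip_dependency(c) for c in clips]
    order = np.argsort(scores)
    return [int(i) for i in order[:budget]]
```

In practice the per-frame embeddings would come from the recurrent convolutional backbone, and the selected clips would be sent to annotators in each active learning round.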
Pages: 1573-1584
Page count: 12
Related papers
50 records in total
  • [1] LRTD: long-range temporal dependency based active learning for surgical workflow recognition
    Shi, Xueying
    Jin, Yueming
    Dou, Qi
    Heng, Pheng-Ann
    INTERNATIONAL JOURNAL OF COMPUTER ASSISTED RADIOLOGY AND SURGERY, 2020, 15 : 1573 - 1584
  • [2] Active Learning With Long-Range Observation
    Lee, Jiho
    Kim, Eunwoo
    IEEE SIGNAL PROCESSING LETTERS, 2024, 31 : 1990 - 1994
  • [3] Action Recognition with Bootstrapping based Long-range Temporal Context Attention
    Liu, Ziming
    Gao, Guangyu
    Qin, A. K.
    Wu, Tong
    Liu, Chi Harold
    PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019, : 583 - 591
  • [4] Deep video compression based on Long-range Temporal Context Learning
    Wu, Kejun
    Li, Zhenxing
    Yang, You
    Liu, Qiong
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 248
  • [5] Against spatial-temporal discrepancy: contrastive learning-based network for surgical workflow recognition
    Xia, Tong
    Jia, Fucang
    INTERNATIONAL JOURNAL OF COMPUTER ASSISTED RADIOLOGY AND SURGERY, 2021, 16 (05) : 839 - 848
  • [6] Learning Long-Range Relationships for Temporal Aircraft Anomaly Detection
    Zhang, Da
    Gao, Junyu
    Li, Xuelong
    IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS, 2024, 60 (05) : 6385 - 6395
  • [7] ECG-based cardiac arrhythmias detection through ensemble learning and fusion of deep spatial-temporal and long-range dependency features
    Din, Sadia
    Qaraqe, Marwa
    Mourad, Omar
    Qaraqe, Khalid
    Serpedin, Erchin
    ARTIFICIAL INTELLIGENCE IN MEDICINE, 2024, 150
  • [8] Temporal-based Swin Transformer network for workflow recognition of surgical video
    Pan, Xiaoying
    Gao, Xuanrong
    Wang, Hongyu
    Zhang, Wuxia
    Mu, Yuanzhen
    He, Xianli
    INTERNATIONAL JOURNAL OF COMPUTER ASSISTED RADIOLOGY AND SURGERY, 2023, 18 (01) : 139 - 147