Multi-Frame Feature Aggregation for Real-Time Instrument Segmentation in Endoscopic Video

Cited by: 15
Authors
Lin, Shan [1]
Qin, Fangbo [2]
Peng, Haonan [1]
Bly, Randall A. [3]
Moe, Kris S. [3]
Hannaford, Blake [1]
Affiliations
[1] Univ Washington, Dept Elect & Comp Engn, Seattle, WA 98195 USA
[2] Chinese Acad Sci, Res Ctr Precis Sensing & Control, Inst Automat, Beijing 100190, Peoples R China
[3] UW, Dept Otolaryngol Head & Neck Surg, Seattle, WA 98105 USA
Funding
U.S. National Science Foundation;
Keywords
Computer vision for medical robotics; deep learning for visual perception; object detection; segmentation and categorization;
DOI
10.1109/LRA.2021.3096156
Chinese Library Classification (CLC) number
TP24 [Robotics];
Discipline classification codes
080202 ; 1405 ;
Abstract
Deep learning-based methods have achieved promising results on surgical instrument segmentation. However, the high computation cost may limit the application of deep models to time-sensitive tasks such as online surgical video analysis for robotic-assisted surgery. Moreover, current methods may still suffer from challenging conditions in surgical images such as various lighting conditions and the presence of blood. We propose a novel Multi-frame Feature Aggregation (MFFA) module to aggregate video frame features temporally and spatially in a recurrent mode. By distributing the computation load of deep feature extraction over sequential frames, we can use a lightweight encoder to reduce the computation costs at each time step. Moreover, public surgical videos usually are not labeled frame by frame, so we develop a method that can randomly synthesize a surgical frame sequence from a single labeled frame to assist network training. We demonstrate that our approach achieves superior performance to corresponding deeper segmentation models on two public surgery datasets.
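The abstract does not say how a frame sequence is synthesized from a single labeled frame. As a rough illustration of the general idea only, the sketch below assumes small random translations accumulated over time and applied identically to the frame and its mask, so that every synthetic frame keeps a valid label. The function names `shift` and `synthesize_sequence`, the parameters `length` and `max_step`, and the choice of translation as the only motion model are all assumptions for illustration, not the authors' method.

```python
import random


def shift(grid, dy, dx, fill=0):
    """Translate a 2D grid by (dy, dx), padding uncovered cells with `fill`."""
    h, w = len(grid), len(grid[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = grid[y][x]
    return out


def synthesize_sequence(frame, mask, length=4, max_step=2, seed=None):
    """Build a pseudo video clip from one labeled frame (hypothetical scheme).

    Each step adds a small random offset to a running translation. The same
    translation is applied to the frame and to its mask, so labels stay aligned.
    The original labeled pair is placed first in the clip.
    """
    rng = random.Random(seed)
    dy = dx = 0
    clip = [(frame, mask)]
    for _ in range(length - 1):
        dy += rng.randint(-max_step, max_step)
        dx += rng.randint(-max_step, max_step)
        clip.append((shift(frame, dy, dx), shift(mask, dy, dx)))
    return clip


if __name__ == "__main__":
    frame = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    mask = [[0, 1, 0], [0, 1, 0], [0, 0, 0]]
    for f, m in synthesize_sequence(frame, mask, length=3, max_step=1, seed=0):
        print(f, m)
```

A real pipeline would more plausibly use richer transforms (rotation, scaling, appearance changes) on image arrays; the nested-list version here only keeps the example dependency-free.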
Pages: 6773 - 6780
Page count: 8
Related papers
50 in total
  • [31] Real-time segmentation of video on a multiprocessor platform
    Arapis, C
    Gibbs, S
    Breiteneder, C
    PARALLEL COMPUTING, 1997, 23 (12) : 1777 - 1792
  • [32] SwiftNet: Real-time Video Object Segmentation
    Wang, Haochen
    Jiang, Xiaolong
    Ren, Haibing
    Hu, Yao
    Bai, Song
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 1296 - 1305
  • [33] AN EXTENDED REAL-TIME COMPRESSIVE TRACKING METHOD USING WEIGHTED MULTI-FRAME COSINE SIMILARITY METRIC
    Jenkins, Mark David
    Barrie, Peter
    Buggy, Tom
    Morison, Gordon
    2014 6TH EUROPEAN EMBEDDED DESIGN IN EDUCATION AND RESEARCH CONFERENCE (EDERC), 2014, : 147 - 151
  • [34] A Multi-level Feature Fusion Network for Real-time Semantic Segmentation
    Wang, Lu
    Xu, Qinzhen
    Xiong, Zixiang
    Huang, Yongming
    Yang, Luxi
    2019 11TH INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS AND SIGNAL PROCESSING (WCSP), 2019,
  • [35] Joint Video Multi-Frame Interpolation and Deblurring under Unknown Exposure Time
    Shang, Wei
    Ren, Dongwei
    Yang, Yi
    Zhang, Hongzhi
    Ma, Kede
    Zuo, Wangmeng
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 13935 - 13944
  • [36] AWFA-LPD: Adaptive Weight Feature Aggregation for Multi-frame License Plate Detection
    Lu, Xiaocheng
    Yuan, Yuan
    Wang, Qi
    PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL (ICMR '21), 2021, : 476 - 480
  • [37] Augmenting efficient real-time surgical instrument segmentation in video with point tracking and Segment Anything
    Wu, Zijian
    Schmidt, Adam
    Kazanzides, Peter
    Salcudean, Septimiu E.
    HEALTHCARE TECHNOLOGY LETTERS, 2025, 12 (01)
  • [38] Efficient global-local memory for real-time instrument segmentation of robotic surgical video
    Wang, Jiacheng
    Jin, Yueming
    Wang, Liansheng
    Cai, Shuntian
    Heng, Pheng-Ann
    Qin, Jing
    arXiv, 2021,
  • [39] Efficient Global-Local Memory for Real-Time Instrument Segmentation of Robotic Surgical Video
    Wang, Jiacheng
    Jin, Yueming
    Wang, Liansheng
    Cai, Shuntian
    Heng, Pheng-Ann
    Qin, Jing
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2021, PT IV, 2021, 12904 : 341 - 351
  • [40] Real-time multi-feature based fire flame detection in video
    Chi, Rui
    Lu, Zhe-Ming
    Ji, Qing-Ge
    IET IMAGE PROCESSING, 2017, 11 (01) : 31 - 37