Multi-Frame Feature Aggregation for Real-Time Instrument Segmentation in Endoscopic Video

Cited by: 15
Authors
Lin, Shan [1]
Qin, Fangbo [2]
Peng, Haonan [1]
Bly, Randall A. [3]
Moe, Kris S. [3]
Hannaford, Blake [1]
Affiliations
[1] Univ Washington, Dept Elect & Comp Engn, Seattle, WA 98195 USA
[2] Chinese Acad Sci, Res Ctr Precis Sensing & Control, Inst Automat, Beijing 100190, Peoples R China
[3] UW, Dept Otolaryngol Head & Neck Surg, Seattle, WA 98105 USA
Funding
U.S. National Science Foundation
Keywords
Computer vision for medical robotics; deep learning for visual perception; object detection; segmentation and categorization;
DOI
10.1109/LRA.2021.3096156
Chinese Library Classification number
TP24 [Robotics]
Discipline classification codes
080202; 1405
Abstract
Deep learning-based methods have achieved promising results on surgical instrument segmentation. However, the high computation cost may limit the application of deep models to time-sensitive tasks such as online surgical video analysis for robotic-assisted surgery. Moreover, current methods may still suffer from challenging conditions in surgical images such as various lighting conditions and the presence of blood. We propose a novel Multi-frame Feature Aggregation (MFFA) module to aggregate video frame features temporally and spatially in a recurrent mode. By distributing the computation load of deep feature extraction over sequential frames, we can use a lightweight encoder to reduce the computation costs at each time step. Moreover, public surgical videos usually are not labeled frame by frame, so we develop a method that can randomly synthesize a surgical frame sequence from a single labeled frame to assist network training. We demonstrate that our approach achieves superior performance to corresponding deeper segmentation models on two public surgery datasets.
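The abstract mentions synthesizing a frame sequence from a single labeled frame but gives no details of the procedure. As a minimal sketch of the general idea only, and not the authors' method, the following assumes that apparent motion is imitated by accumulating small random translations and that the same offset is applied to the image and its mask, so every synthetic frame stays labeled. The names `shift_zero_fill` and `synthesize_sequence` and the parameters `num_frames`, `max_step`, and `seed` are hypothetical.

```python
import numpy as np


def shift_zero_fill(arr, dy, dx):
    """Shift a (H, W, ...) array by (dy, dx) pixels; exposed borders become zero."""
    out = np.zeros_like(arr)
    h, w = arr.shape[:2]
    # Clamp slice ends at 0 so shifts larger than the image yield an empty copy.
    src_y = slice(max(0, -dy), max(0, min(h, h - dy)))
    dst_y = slice(max(0, dy), max(0, min(h, h + dy)))
    src_x = slice(max(0, -dx), max(0, min(w, w - dx)))
    dst_x = slice(max(0, dx), max(0, min(w, w + dx)))
    out[dst_y, dst_x] = arr[src_y, src_x]
    return out


def synthesize_sequence(image, mask, num_frames=4, max_step=3, seed=0):
    """Build a pseudo-video from one labeled frame (illustrative assumption).

    Frame 0 is the original frame. Each later frame adds a random integer
    step to a running (dy, dx) offset, and the image and mask are shifted
    by that same offset so the labels stay aligned with the pixels.
    """
    rng = np.random.default_rng(seed)
    frames, masks = [], []
    dy = dx = 0
    for _ in range(num_frames):
        frames.append(shift_zero_fill(image, dy, dx))
        masks.append(shift_zero_fill(mask, dy, dx))
        step = rng.integers(-max_step, max_step + 1, size=2)
        dy += int(step[0])
        dx += int(step[1])
    return frames, masks


if __name__ == "__main__":
    image = np.random.default_rng(1).random((64, 64, 3))
    mask = np.zeros((64, 64), dtype=np.uint8)
    mask[20:40, 25:45] = 1  # toy instrument region
    frames, masks = synthesize_sequence(image, mask, num_frames=5)
    print(len(frames), frames[0].shape, masks[0].shape)
```

A realistic augmentation would likely also vary rotation, scale, and appearance; translation alone is used here to keep the sketch short and dependency-free beyond NumPy.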
Pages: 6773-6780
Page count: 8