Capturing the spatio-temporal continuity for video semantic segmentation

Cited by: 3
Authors
Chen, Xin [1]
Wu, Aming [1]
Han, Yahong [1]
Affiliation
[1] Tianjin Univ, Coll Intelligence & Comp, Yaguan Rd, Tianjin, Peoples R China
Keywords
feature extraction; image segmentation; image representation; video signal processing; neural nets; probability; video semantic segmentation; image semantic segmentation; convolutional neural network; image segmentation algorithms; video frame; temporal region continuity inherent; videos; deep neural network architecture; newly devised spatio-temporal continuity; encoding network; STC module; decoding network; high-level feature map; STC feature map; current feature representation; consecutive video frames; segmentation result;
DOI
10.1049/iet-ipr.2018.6479
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, image semantic segmentation based on convolutional neural networks has advanced considerably, whereas video semantic segmentation has developed more slowly. Applying image segmentation algorithms to each video frame separately may ignore the temporal region continuity inherent in videos. In this study, the authors propose a novel deep neural network architecture with a newly devised spatio-temporal continuity (STC) module for video semantic segmentation. Specifically, the architecture comprises an encoding network, an STC module, and a decoding network. The encoding network extracts a high-level feature map. The STC module then takes the high-level feature map as input and extracts the STC feature map. For decoding, they use four dilated convolutional layers to obtain a more abstract representation and a deconvolutional layer to increase the size of the representation. Finally, they fuse the current feature representation with the previous feature representation and obtain the class probabilities. The architecture thus receives a sequence of consecutive video frames and outputs the segmentation result for the current frame. They extensively evaluate the proposed approach on the CamVid and KITTI datasets. Compared with other methods, the authors' approach not only achieves competitive performance but also has lower complexity.
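The final step described in the abstract, fusing the current and previous feature representations and turning the result into class probabilities, can be illustrated with a minimal pure-Python sketch. The abstract does not state how the fusion is performed, so the weighted average used here, the `weight` parameter, and the function names `fuse_and_classify` and `predict_labels` are all assumptions made for illustration and are not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of per-class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(current, previous, weight=0.5):
    """Fuse two per-pixel score maps and return per-pixel class probabilities.

    current, previous: nested lists shaped [height][width][num_classes].
    weight: hypothetical mixing weight for the current frame (an assumption;
    the abstract does not specify the fusion rule).
    """
    probs = []
    for row_cur, row_prev in zip(current, previous):
        out_row = []
        for px_cur, px_prev in zip(row_cur, row_prev):
            # Assumed fusion: weighted average of current and previous scores.
            fused = [weight * c + (1.0 - weight) * p
                     for c, p in zip(px_cur, px_prev)]
            out_row.append(softmax(fused))
        probs.append(out_row)
    return probs


def predict_labels(probs):
    """Pick the most probable class index for every pixel."""
    return [[max(range(len(px)), key=px.__getitem__) for px in row]
            for row in probs]


# Toy 1x2 "frame" with two classes. Using the current frame alone, the
# second pixel would be labelled class 1; the previous frame pulls it to 0.
current = [[[2.0, 0.0], [0.0, 1.0]]]
previous = [[[1.0, 0.0], [3.0, 0.0]]]
probs = fuse_and_classify(current, previous)
labels = predict_labels(probs)  # [[0, 0]]
```

In this toy input the second pixel changes label once the previous frame is taken into account, which is the kind of temporal smoothing that frame-by-frame segmentation cannot provide. A real model would learn the fusion rather than fix the weight by hand.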
Pages: 2813-2820
Number of pages: 8