Temporal capsule networks for video motion estimation and error concealment

Cited by: 0

Authors:

Arun Sankisa

Arjun Punjabi

Aggelos K. Katsaggelos

Affiliation:

[1] Northwestern University, Department of Electrical and Computer Engineering

Source:

Signal, Image and Video Processing | 2020 / Vol. 14

Keywords:

Capsule networks; Conv3D; ConvLSTM; Error concealment; Motion estimation

DOI:

Not available

Abstract:

In this paper, we present a temporal capsule network architecture to encode motion in videos as an instantiation parameter. The extracted motion is used to perform motion-compensated error concealment. We modify the original architecture and use a carefully curated dataset to enable the training of capsules spatially and temporally. First, we add the temporal dimension by taking co-located “patches” from three consecutive frames obtained from standard video sequences to form input data “cubes.” Second, the network is designed with an initial feature extraction layer that operates on all three dimensions to generate spatiotemporal features. Additionally, we implement the PrimaryCaps module with a recurrent layer, instead of a conventional convolutional layer, to extract short-term motion-related temporal dependencies and encode them as activation vectors in the capsule output. Finally, the capsule output is combined with the most recent past frame and passed through a fully connected reconstruction network to perform motion-compensated error concealment. We study the effectiveness of temporal capsules by comparing the proposed model with architectures that do not include capsules. Although the quality of the reconstruction shows room for improvement, we successfully demonstrate that capsule-based architectures can be designed to operate in the temporal dimension to encode motion-related attributes as instantiation parameters. The accuracy of motion estimation is evaluated by comparing both the reconstructed frame outputs and the corresponding optical flow estimates with ground truth data.
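The input construction described in the abstract (co-located patches from three consecutive frames stacked into "cubes") can be illustrated with a minimal sketch. This is not the authors' code: the function name `extract_cubes`, the `patch` and `stride` parameters, and the use of plain nested lists for grayscale frames are all assumptions made for illustration, and the abstract does not state the patch size or stride that was actually used.

```python
from typing import List

Frame = List[List[float]]        # one grayscale frame: rows of pixel values
Cube = List[List[List[float]]]   # indexed as [time][row][col]


def extract_cubes(frames: List[Frame], patch: int, stride: int) -> List[Cube]:
    """Cut co-located patches out of every run of three consecutive frames.

    Hypothetical helper: `patch` is the patch side length in pixels and
    `stride` is the step between neighbouring patch positions.
    """
    if len(frames) < 3:
        raise ValueError("need at least three frames")
    height, width = len(frames[0]), len(frames[0][0])
    cubes: List[Cube] = []
    for t in range(len(frames) - 2):          # sliding window of 3 frames
        triple = frames[t:t + 3]
        for top in range(0, height - patch + 1, stride):
            for left in range(0, width - patch + 1, stride):
                # The same (top, left) offset is used in all three frames,
                # which is what makes the patches co-located.
                cube = [
                    [row[left:left + patch] for row in frame[top:top + patch]]
                    for frame in triple
                ]
                cubes.append(cube)
    return cubes
```

With four 4x4 frames, `patch=2`, and `stride=2`, this yields two temporal windows of four patch positions each, so eight cubes of shape 3x2x2. In a setup like the one the abstract describes, such cubes would be fed to a feature extraction layer operating on all three dimensions.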

Citation

Pages: 1369-1377

Page count: 8

Sankisa, Arun; Punjabi, Arjun; Katsaggelos, Aggelos K. Temporal capsule networks for video motion estimation and error concealment [J]. Signal, Image and Video Processing, 2020, 14 (07): 1369-1377