Clockwork Variational Autoencoders

Cited by: 0
Authors
Saxena, Vaibhav [1 ]
Ba, Jimmy [1 ]
Hafner, Danijar [1 ,2 ]
Affiliations
[1] Univ Toronto, Toronto, ON, Canada
[2] Google Res, Brain Team, Mountain View, CA USA
Keywords: (none listed)
DOI: (none available)
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline Classification Codes: 081104; 0812; 0835; 1405
Abstract
Deep learning has enabled algorithms to generate realistic images. However, accurately predicting long video sequences requires understanding long-term dependencies and remains an open challenge. While existing video prediction models succeed at generating sharp images, they tend to fail at accurately predicting far into the future. We introduce the Clockwork VAE (CW-VAE), a video prediction model that leverages a hierarchy of latent sequences, where higher levels tick at slower intervals. We demonstrate the benefits of both hierarchical latents and temporal abstraction on 4 diverse video prediction datasets with sequences of up to 1000 frames, where CW-VAE outperforms top video prediction models. Additionally, we propose a Minecraft benchmark for long-term video prediction. We conduct several experiments to gain insights into CW-VAE and confirm that slower levels learn to represent objects that change more slowly in the video, and faster levels learn to represent faster objects.
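The abstract describes the core mechanism only at a high level: a hierarchy of latent sequences in which higher levels tick at slower intervals. The Python sketch below is a toy illustration of that update schedule, not the authors' model; the clockwork_rollout function, the stride of 4, and the linear "transition" are assumptions made up for illustration.

import numpy as np

# Toy sketch of a "clockwork" update schedule (assumed, not the paper's code):
# level k of the latent hierarchy only updates every stride**k frames,
# so higher levels change more slowly than lower ones.
def clockwork_rollout(num_levels=3, stride=4, num_frames=32, state_dim=8, seed=0):
    rng = np.random.default_rng(seed)
    states = [rng.normal(size=state_dim) for _ in range(num_levels)]
    history = []
    for t in range(num_frames):
        for k in reversed(range(num_levels)):      # update the top (slowest) level first
            if t % (stride ** k) == 0:             # level k "ticks" every stride**k frames
                top_down = states[k + 1] if k + 1 < num_levels else 0.0
                # placeholder linear transition mixing old state, top-down context, and noise
                states[k] = 0.9 * states[k] + 0.1 * top_down + 0.05 * rng.normal(size=state_dim)
        history.append([s.copy() for s in states])
    return history  # num_frames entries, each a list of num_levels latent vectors

rollout = clockwork_rollout()
print(len(rollout), "frames x", len(rollout[0]), "levels")

With three levels and a stride of 4, the top level in this sketch updates only every 16 frames, which mirrors the abstract's observation that slower levels end up representing content that changes slowly in the video.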
Pages: 12
Related Papers (50 in total)
  • [1] Mixture variational autoencoders
    Jiang, Shuoran; Chen, Yarui; Yang, Jucheng; Zhang, Chuanlei; Zhao, Tingting
    Pattern Recognition Letters, 2019, 128: 263-269
  • [2] An Introduction to Variational Autoencoders
    Kingma, Diederik P.; Welling, Max
    Foundations and Trends in Machine Learning, 2019, 12(4): 4-89
  • [3] Mixtures of Variational Autoencoders
    Ye, Fei; Bors, Adrian G.
    2020 Tenth International Conference on Image Processing Theory, Tools and Applications (IPTA), 2020
  • [4] Subitizing with Variational Autoencoders
    Wever, Rijnder; Runia, Tom F. H.
    Computer Vision - ECCV 2018 Workshops, Pt III, 2019, 11131: 617-627
  • [5] Variational Laplace Autoencoders
    Park, Yookoon; Kim, Chris Dongjoo; Kim, Gunhee
    International Conference on Machine Learning, Vol. 97, 2019
  • [6] Diffusion Variational Autoencoders
    Rey, Luis A. Perez; Menkovski, Vlado; Portegies, Jim
    Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020: 2704-2710
  • [7] Overdispersed Variational Autoencoders
    Shah, Harshil; Barber, David; Botev, Aleksandar
    2017 International Joint Conference on Neural Networks (IJCNN), 2017: 1109-1116
  • [8] Ladder Variational Autoencoders
    Sonderby, Casper Kaae; Raiko, Tapani; Maaloe, Lars; Sonderby, Soren Kaae; Winther, Ole
    Advances in Neural Information Processing Systems 29 (NIPS 2016), 2016, 29
  • [9] Tree Variational Autoencoders
    Manduchi, Laura; Vandenhirtz, Moritz; Ryser, Alain; Vogt, Julia E.
    Advances in Neural Information Processing Systems 36 (NeurIPS 2023), 2023
  • [10] Affine Variational Autoencoders
    Bidart, Rene; Wong, Alexander
    Image Analysis and Recognition, ICIAR 2019, Pt I, 2019, 11662: 461-472