Clockwork Variational Autoencoders

Cited by: 0
Authors
Saxena, Vaibhav [1 ]
Ba, Jimmy [1 ]
Hafner, Danijar [1 ,2 ]
Affiliations
[1] Univ Toronto, Toronto, ON, Canada
[2] Google Res, Brain Team, Mountain View, CA USA
Keywords: (none listed)
DOI: (none available)
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline Classification Codes: 081104; 0812; 0835; 1405
Abstract
Deep learning has enabled algorithms to generate realistic images. However, accurately predicting long video sequences requires understanding long-term dependencies and remains an open challenge. While existing video prediction models succeed at generating sharp images, they tend to fail at accurately predicting far into the future. We introduce the Clockwork VAE (CW-VAE), a video prediction model that leverages a hierarchy of latent sequences, where higher levels tick at slower intervals. We demonstrate the benefits of both hierarchical latents and temporal abstraction on 4 diverse video prediction datasets with sequences of up to 1000 frames, where CW-VAE outperforms top video prediction models. Additionally, we propose a Minecraft benchmark for long-term video prediction. We conduct several experiments to gain insights into CW-VAE and confirm that slower levels learn to represent objects that change more slowly in the video, and faster levels learn to represent faster objects.
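The abstract describes the core mechanism only at a high level: a hierarchy of latent sequences in which higher levels tick at slower intervals. The Python sketch below is a toy illustration of that update schedule, not the authors' model; the clockwork_rollout function, the stride of 4, and the linear "transition" are assumptions made up for illustration.

import numpy as np

# Toy sketch of a "clockwork" update schedule (assumed, not the paper's code):
# level k of the latent hierarchy only updates every stride**k frames,
# so higher levels change more slowly than lower ones.
def clockwork_rollout(num_levels=3, stride=4, num_frames=32, state_dim=8, seed=0):
    rng = np.random.default_rng(seed)
    states = [rng.normal(size=state_dim) for _ in range(num_levels)]
    history = []
    for t in range(num_frames):
        for k in reversed(range(num_levels)):      # update the top (slowest) level first
            if t % (stride ** k) == 0:             # level k "ticks" every stride**k frames
                top_down = states[k + 1] if k + 1 < num_levels else 0.0
                # placeholder linear transition mixing old state, top-down context, and noise
                states[k] = 0.9 * states[k] + 0.1 * top_down + 0.05 * rng.normal(size=state_dim)
        history.append([s.copy() for s in states])
    return history  # num_frames entries, each a list of num_levels latent vectors

rollout = clockwork_rollout()
print(len(rollout), "frames x", len(rollout[0]), "levels")

With three levels and a stride of 4, the top level in this sketch updates only every 16 frames, which mirrors the abstract's observation that slower levels end up representing content that changes slowly in the video.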
Pages: 12
Related Papers (50 in total)
  • [1] Mixture variational autoencoders
    Jiang, Shuoran; Chen, Yarui; Yang, Jucheng; Zhang, Chuanlei; Zhao, Tingting
    Pattern Recognition Letters, 2019, 128: 263-269
  • [2] An Introduction to Variational Autoencoders
    Kingma, Diederik P.; Welling, Max
    Foundations and Trends in Machine Learning, 2019, 12(4): 4-89
  • [3] Mixtures of Variational Autoencoders
    Ye, Fei; Bors, Adrian G.
    2020 Tenth International Conference on Image Processing Theory, Tools and Applications (IPTA), 2020
  • [4] Subitizing with Variational Autoencoders
    Wever, Rijnder; Runia, Tom F. H.
    Computer Vision - ECCV 2018 Workshops, Pt III, 2019, 11131: 617-627
  • [5] Variational Laplace Autoencoders
    Park, Yookoon; Kim, Chris Dongjoo; Kim, Gunhee
    International Conference on Machine Learning, Vol. 97, 2019
  • [6] Diffusion Variational Autoencoders
    Rey, Luis A. Perez; Menkovski, Vlado; Portegies, Jim
    Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020: 2704-2710
  • [7] Overdispersed Variational Autoencoders
    Shah, Harshil; Barber, David; Botev, Aleksandar
    2017 International Joint Conference on Neural Networks (IJCNN), 2017: 1109-1116
  • [8] Ladder Variational Autoencoders
    Sonderby, Casper Kaae; Raiko, Tapani; Maaloe, Lars; Sonderby, Soren Kaae; Winther, Ole
    Advances in Neural Information Processing Systems 29 (NIPS 2016), 2016, 29
  • [9] Tree Variational Autoencoders
    Manduchi, Laura; Vandenhirtz, Moritz; Ryser, Alain; Vogt, Julia E.
    Advances in Neural Information Processing Systems 36 (NeurIPS 2023), 2023
  • [10] Affine Variational Autoencoders
    Bidart, Rene; Wong, Alexander
    Image Analysis and Recognition, ICIAR 2019, Pt I, 2019, 11662: 461-472