The Dreaming Variational Autoencoder for Reinforcement Learning Environments

Cited by: 7
Authors:
Andersen, Per-Arne [1 ]
Goodwin, Morten [1 ]
Granmo, Ole-Christoffer [1 ]
Institutions:
[1] Univ Agder, Dept ICT, Grimstad, Norway
Keywords:
Deep reinforcement learning; Environment modeling; Neural networks; Variational autoencoder; Markov decision processes; Exploration; Artificial experience-replay
DOI:
10.1007/978-3-030-04191-5_11
Chinese Library Classification:
TP18 [Artificial Intelligence Theory]
Subject Classification Codes:
081104; 0812; 0835; 1405
Abstract:
Reinforcement learning has shown great potential in generalizing over raw sensory data using only a single neural network for value optimization. Several challenges in current state-of-the-art reinforcement learning algorithms prevent them from converging toward the global optimum. The solution to these problems likely lies in short- and long-term planning, exploration, and memory management for reinforcement learning algorithms. Games are often used to benchmark reinforcement learning algorithms, as they provide flexible, reproducible, and easy-to-control environments. However, few games feature a state-space in which progress in exploration, memory, and planning is easily observed. This paper presents the Dreaming Variational Autoencoder (DVAE), a neural-network-based generative modeling architecture for exploration in environments with sparse feedback. We further present Deep Maze, a novel and flexible maze engine that challenges DVAE with partially and fully observable state-spaces, long-horizon tasks, and deterministic and stochastic problems. We show initial findings and encourage further work in reinforcement learning driven by generative exploration.
Pages: 143-155 (13 pages)
Related Papers (50 total):
  • [1] Efficiency of Reinforcement Learning using Polarized Regime by Variational Autoencoder
    Nakai, Masato
    Shibuya, Takeshi
    [J]. 2022 61ST ANNUAL CONFERENCE OF THE SOCIETY OF INSTRUMENT AND CONTROL ENGINEERS (SICE), 2022, : 128 - 134
  • [2] Banner layout retargeting with hierarchical reinforcement learning and variational autoencoder
    Hu, Hao
    Zhang, Chao
    Liang, Yanxue
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (24) : 34417 - 34438
  • [3] Remote sensing image captioning via Variational Autoencoder and Reinforcement Learning
    Shen, Xiangqing
    Liu, Bing
    Zhou, Yong
    Zhao, Jiaqi
    Liu, Mingming
    [J]. KNOWLEDGE-BASED SYSTEMS, 2020, 203
  • [4] VARL: a variational autoencoder-based reinforcement learning framework for vehicle routing problems
    Wang, Qi
    [J]. APPLIED INTELLIGENCE, 2022, 52 (08) : 8910 - 8923
  • [5] Learning Community Structure with Variational Autoencoder
    Choong, Jun Jin
    Liu, Xin
    Murata, Tsuyoshi
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2018, : 69 - 78
  • [6] The Difference Learning of Hidden Layer between Autoencoder and Variational Autoencoder
    Xu, Qingyang
    Wu, Zhe
    Yang, Yiqin
    Zhang, Li
    [J]. 2017 29TH CHINESE CONTROL AND DECISION CONFERENCE (CCDC), 2017, : 4801 - 4804
  • [7] Predicting chemical structure using reinforcement learning with a stack-augmented conditional variational autoencoder
    Kim, Hwanhee
    Ko, Soohyun
    Kim, Byung Ju
    Ryu, Sung Jin
    Ahn, Jaegyoon
    [J]. JOURNAL OF CHEMINFORMATICS, 2022, 14 (01)