The Dreaming Variational Autoencoder for Reinforcement Learning Environments

Cited by: 7
Authors:
Andersen, Per-Arne [1 ]
Goodwin, Morten [1 ]
Granmo, Ole-Christoffer [1 ]
Institutions:
[1] Univ Agder, Dept ICT, Grimstad, Norway
Keywords:
Deep reinforcement learning; Environment modeling; Neural networks; Variational autoencoder; Markov decision processes; Exploration; Artificial experience-replay
DOI:
10.1007/978-3-030-04191-5_11
Chinese Library Classification:
TP18 [Artificial Intelligence Theory]
Subject Classification Codes:
081104; 0812; 0835; 1405
Abstract:
Reinforcement learning has shown great potential in generalizing over raw sensory data using only a single neural network for value optimization. Several challenges in current state-of-the-art reinforcement learning algorithms prevent them from converging toward the global optimum. The solution to these problems likely lies in short- and long-term planning, exploration, and memory management for reinforcement learning algorithms. Games are often used to benchmark reinforcement learning algorithms, as they provide flexible, reproducible, and easy-to-control environments. However, few games feature a state-space in which progress in exploration, memory, and planning is easily observed. This paper presents the Dreaming Variational Autoencoder (DVAE), a neural-network-based generative modeling architecture for exploration in environments with sparse feedback. We further present Deep Maze, a novel and flexible maze engine that challenges DVAE with partially and fully observable state-spaces, long-horizon tasks, and deterministic and stochastic problems. We show initial findings and encourage further work in reinforcement learning driven by generative exploration.
Pages: 143-155 (13 pages)
Related Papers (50 total):
  • [1] Efficiency of Reinforcement Learning using Polarized Regime by Variational Autoencoder
    Nakai, Masato
    Shibuya, Takeshi
    [J]. 2022 61ST ANNUAL CONFERENCE OF THE SOCIETY OF INSTRUMENT AND CONTROL ENGINEERS (SICE), 2022, : 128 - 134
  • [2] Banner layout retargeting with hierarchical reinforcement learning and variational autoencoder
    Hu, Hao
    Zhang, Chao
    Liang, Yanxue
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (24) : 34417 - 34438
  • [3] Remote sensing image captioning via Variational Autoencoder and Reinforcement Learning
    Shen, Xiangqing
    Liu, Bing
    Zhou, Yong
    Zhao, Jiaqi
    Liu, Mingming
    [J]. KNOWLEDGE-BASED SYSTEMS, 2020, 203
  • [4] VARL: a variational autoencoder-based reinforcement learning framework for vehicle routing problems
    Wang, Qi
    [J]. APPLIED INTELLIGENCE, 2022, 52 (08) : 8910 - 8923
  • [5] Learning Community Structure with Variational Autoencoder
    Choong, Jun Jin
    Liu, Xin
    Murata, Tsuyoshi
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2018, : 69 - 78
  • [6] The Difference Learning of Hidden Layer between Autoencoder and Variational Autoencoder
    Xu, Qingyang
    Wu, Zhe
    Yang, Yiqin
    Zhang, Li
    [J]. 2017 29TH CHINESE CONTROL AND DECISION CONFERENCE (CCDC), 2017, : 4801 - 4804
  • [7] Predicting chemical structure using reinforcement learning with a stack-augmented conditional variational autoencoder
    Kim, Hwanhee
    Ko, Soohyun
    Kim, Byung Ju
    Ryu, Sung Jin
    Ahn, Jaegyoon
    [J]. JOURNAL OF CHEMINFORMATICS, 2022, 14 (01)