Training and Evaluation of Deep Policies Using Reinforcement Learning and Generative Models

Cited by: 0
Authors
Ghadirzadeh, Ali [1 ]
Poklukar, Petra [2 ]
Arndt, Karol [3 ]
Finn, Chelsea [1 ]
Kyrki, Ville [3 ]
Kragic, Danica [2 ]
Bjorkman, Marten [2 ]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] KTH Royal Inst Technol, Stockholm, Sweden
[3] Aalto Univ, Espoo, Finland
Keywords
reinforcement learning; policy search; robot learning; deep generative models; representation learning; primitives
DOI
Not available
Chinese Library Classification (CLC)
TP [automation technology; computer technology]
Discipline classification code
0812
Abstract
We present a data-efficient framework for solving sequential decision-making problems that combines reinforcement learning (RL) with latent variable generative models. The framework, called GenRL, trains deep policies by introducing an action latent variable so that the feed-forward policy search can be divided into two parts: (i) training a sub-policy that outputs a distribution over the action latent variable given a state of the system, and (ii) unsupervised training of a generative model that outputs a sequence of motor actions conditioned on the latent action variable. GenRL enables safe exploration and alleviates data inefficiency because it exploits prior knowledge about valid sequences of motor actions. Moreover, we provide a set of measures for evaluating generative models, which allow us to predict the performance of RL policy training before the actual training on a physical robot. We experimentally determine the characteristics of generative models that most influence the performance of the final policy on two robotics tasks: shooting a hockey puck and throwing a basketball. Furthermore, we empirically demonstrate that, compared to two state-of-the-art RL methods, GenRL is the only method that can solve both robotics tasks safely and efficiently.
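To make the two-part decomposition concrete, here is a minimal PyTorch sketch of the kind of architecture the abstract describes: a sub-policy that maps a state to a Gaussian distribution over the action latent variable, and a generative decoder, assumed pretrained on valid motor trajectories, that turns a latent sample into a full motor-action sequence. All module names, dimensions, and the REINFORCE-style update are illustrative assumptions, not the paper's actual implementation.

import torch
import torch.nn as nn

class LatentSubPolicy(nn.Module):
    # Part (i): state -> distribution over the action latent variable.
    def __init__(self, state_dim, latent_dim, hidden=128):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, latent_dim)
        self.log_std = nn.Linear(hidden, latent_dim)

    def forward(self, state):
        h = self.body(state)
        return torch.distributions.Normal(self.mu(h), self.log_std(h).exp())

class ActionDecoder(nn.Module):
    # Part (ii): generative model decoding a latent into a sequence of motor
    # actions; in GenRL this would be trained unsupervised (e.g., as a VAE
    # decoder) on valid trajectories, then frozen during policy search.
    def __init__(self, latent_dim, action_dim, horizon, hidden=128):
        super().__init__()
        self.horizon, self.action_dim = horizon, action_dim
        self.net = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, horizon * action_dim))

    def forward(self, z):
        return self.net(z).view(-1, self.horizon, self.action_dim)

# One rollout: sample a latent, decode an open-loop action sequence,
# execute it, and update the sub-policy from the episodic reward.
state_dim, latent_dim, action_dim, horizon = 10, 4, 7, 50
sub_policy = LatentSubPolicy(state_dim, latent_dim)
decoder = ActionDecoder(latent_dim, action_dim, horizon)  # assumed pretrained

state = torch.randn(1, state_dim)
dist = sub_policy(state)
z = dist.rsample()                # reparameterized latent sample
actions = decoder(z)              # (1, horizon, action_dim) motor commands
# reward = env.execute(actions)   # hypothetical robot/environment call
# loss = -dist.log_prob(z).sum() * (reward - baseline)  # REINFORCE-style

Because exploration happens in the low-dimensional latent space and every decoded sequence resembles the training trajectories, the robot never executes arbitrary motor commands, which is the safety and data-efficiency argument the abstract makes.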
Pages: 37
Related papers
50 records in total
  • [31] Example-guided learning of stochastic human driving policies using deep reinforcement learning
    Emuna, Ran
    Duffney, Rotem
    Borowsky, Avinoam
    Biess, Armin
    NEURAL COMPUTING & APPLICATIONS, 2023, 35(23): 16791-16804
  • [32] LFQ: Online Learning of Per-flow Queuing Policies using Deep Reinforcement Learning
    Bachl, Maximilian
    Fabini, Joachim
    Zseby, Tanja
    PROCEEDINGS OF THE 2020 IEEE 45TH CONFERENCE ON LOCAL COMPUTER NETWORKS (LCN 2020), 2020: 417-420
  • [34] Boosting Deep Reinforcement Learning Agents with Generative Data Augmentation
    Papagiannis, Tasos
    Alexandridis, Georgios
    Stafylopatis, Andreas
    APPLIED SCIENCES-BASEL, 2024, 14(1)
  • [35] Reinforcement Learning with Deep Energy-Based Policies
    Haarnoja, Tuomas
    Tang, Haoran
    Abbeel, Pieter
    Levine, Sergey
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017
  • [36] Autoregressive Policies for Continuous Control Deep Reinforcement Learning
    Korenkevych, Dmytro
    Mahmood, A. Rupam
    Vasan, Gautham
    Bergstra, James
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019: 2754-2762
  • [37] The State of Sparse Training in Deep Reinforcement Learning
    Graesser, Laura
    Evci, Utku
    Elsen, Erich
    Castro, Pablo Samuel
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [38] Counterfactual state explanations for reinforcement learning agents via generative deep learning
    Olson, Matthew L.
    Khanna, Roli
    Neal, Lawrence
    Li, Fuxin
    Wong, Weng-Keen
    ARTIFICIAL INTELLIGENCE, 2021, 295
  • [39] Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models
    Chua, Kurtland
    Calandra, Roberto
    McAllister, Rowan
    Levine, Sergey
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018
  • [40] Dual Control by Reinforcement Learning Using Deep Hyperstate Transition Models
    Rosdahl, Christian
    Cervin, Anton
    Bernhardsson, Bo
    IFAC PAPERSONLINE, 2022, 55(12): 395-401