Training and Evaluation of Deep Policies Using Reinforcement Learning and Generative Models

Cited by: 0

Authors:

Ghadirzadeh, Ali [1]

Poklukar, Petra [2]

Arndt, Karol [3]

Finn, Chelsea [1]

Kyrki, Ville [3]

Kragic, Danica [2]

Björkman, Mårten [2]

Affiliations:

[1] Stanford Univ, Stanford, CA 94305 USA

[2] KTH Royal Inst Technol, Stockholm, Sweden

[3] Aalto Univ, Espoo, Finland

Source:

Journal of Machine Learning Research | 2022 / Vol. 23

Keywords:

reinforcement learning; policy search; robot learning; deep generative models; representation learning; primitives

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

We present a data-efficient framework for solving sequential decision-making problems that exploits the combination of reinforcement learning (RL) and latent variable generative models. The framework, called GenRL, trains deep policies by introducing an action latent variable such that the feed-forward policy search can be divided into two parts: (i) training a sub-policy that outputs a distribution over the action latent variable given a state of the system, and (ii) unsupervised training of a generative model that outputs a sequence of motor actions conditioned on the latent action variable. GenRL enables safe exploration and alleviates the data-inefficiency problem as it exploits prior knowledge about valid sequences of motor actions. Moreover, we provide a set of measures for evaluation of generative models such that we are able to predict the performance of the RL policy training prior to the actual training on a physical robot. We experimentally determine the characteristics of generative models that have most influence on the performance of the final policy training on two robotics tasks: shooting a hockey puck and throwing a basketball. Furthermore, we empirically demonstrate that GenRL is the only method that can safely and efficiently solve the robotics tasks compared to two state-of-the-art RL methods.
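
The two-part structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the linear sub-policy, the fixed standard deviation, the toy `decode` function standing in for a pretrained generative model, and all names and dimensions (`LATENT_DIM`, `HORIZON`, `sub_policy`, `act`) are assumptions made purely for illustration.

```python
import math
import random

LATENT_DIM = 2  # hypothetical size of the action latent variable
HORIZON = 5     # hypothetical number of time steps in the motor action sequence


def sub_policy(state, weights):
    """Part (i): map a state to a diagonal Gaussian over the latent action.

    A linear map is assumed here only to keep the sketch short.
    """
    mean = [sum(w * s for w, s in zip(row, state)) for row in weights]
    std = [0.1] * len(mean)  # fixed exploration noise, an assumption
    return mean, std


def decode(latent, horizon=HORIZON):
    """Part (ii): stand-in for a pretrained generative decoder.

    Maps a latent action to a sequence of motor actions. tanh keeps every
    action in [-1, 1], mimicking a decoder that only emits valid commands.
    """
    return [
        [math.tanh(z * math.sin((t + 1) * (i + 1))) for i, z in enumerate(latent)]
        for t in range(horizon)
    ]


def act(state, weights, rng):
    """Sample a latent action from the sub-policy, then decode it."""
    mean, std = sub_policy(state, weights)
    latent = [rng.gauss(m, s) for m, s in zip(mean, std)]
    return decode(latent)


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [[0.1, 0.2, -0.3], [0.0, 0.5, 0.4]]  # LATENT_DIM x state_dim
    trajectory = act([0.5, -0.2, 1.0], weights, rng)
    print(len(trajectory), "steps of", len(trajectory[0]), "motor commands")
```

In this sketch only `sub_policy` would be trained with RL, while `decode` would be trained beforehand, without reward signals, on examples of valid motor action sequences.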


Pages: 37

