Learning Dynamics and Generalization in Deep Reinforcement Learning

Cited by: 0
Authors
Lyle, Clare [1 ]
Rowland, Mark [2 ]
Dabney, Will [2 ]
Kwiatkowska, Marta [1]
Gal, Yarin [1 ]
Affiliations
[1] Univ Oxford, Dept Comp Sci, Oxford, England
[2] DeepMind, London, England
Funding
EU Horizon 2020
Keywords: (none listed)
DOI: Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Solving a reinforcement learning (RL) problem poses two competing challenges: fitting a potentially discontinuous value function, and generalizing well to new observations. In this paper, we analyze the learning dynamics of temporal difference algorithms to gain novel insight into the tension between these two objectives. We show theoretically that temporal difference learning encourages agents to fit non-smooth components of the value function early in training, and at the same time induces the second-order effect of discouraging generalization. We corroborate these findings in deep RL agents trained on a range of environments, finding that neural networks trained using temporal difference algorithms on dense reward tasks exhibit weaker generalization between states than randomly initialized networks and networks trained with policy gradient methods. Finally, we investigate how post-training policy distillation may avoid this pitfall, and show that this approach improves generalization to novel environments in the ProcGen suite and improves robustness to input perturbations.
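As a concrete illustration of the temporal difference update whose learning dynamics the abstract refers to, below is a minimal sketch of semi-gradient TD(0) with a linear value function on a toy random-walk chain. The environment, feature map, and hyperparameters are illustrative assumptions for exposition only, not the authors' experimental setup or code.

import numpy as np

# Minimal sketch: semi-gradient TD(0) value estimation on a toy
# random-walk chain. All choices here (chain length, one-hot features,
# learning rate) are illustrative assumptions, not the paper's setup.
rng = np.random.default_rng(0)

n_states, gamma, alpha = 5, 0.9, 0.1
features = np.eye(n_states)   # one-hot (tabular) features: V(s) = w @ phi(s)
w = np.zeros(n_states)        # linear value-function weights

def step(s):
    """Random walk: move left or right; reward 1 on reaching the right end."""
    s_next = int(s + rng.choice([-1, 1]))
    reward = 1.0 if s_next == n_states - 1 else 0.0
    done = s_next in (0, n_states - 1)   # both ends are terminal
    return s_next, reward, done

for _ in range(500):                     # episodes
    s, done = n_states // 2, False       # start in the middle of the chain
    while not done:
        s_next, reward, done = step(s)
        v = w @ features[s]
        v_next = 0.0 if done else w @ features[s_next]
        # Semi-gradient TD(0): the bootstrapped target reward + gamma * V(s')
        # is held fixed, so no gradient flows through v_next.
        td_error = reward + gamma * v_next - v
        w += alpha * td_error * features[s]
        s = s_next

print("Estimated state values:", np.round(w, 2))

The detail to note is that the bootstrapped target reward + gamma * V(s') is treated as a constant during each update; this semi-gradient property is what distinguishes the temporal difference dynamics analyzed in the paper from ordinary supervised regression on fixed targets.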
Pages: 22