Statistics and Samples in Distributional Reinforcement Learning

Cited by: 0
Authors
Rowland, Mark [1 ]
Dadashi, Robert [2 ]
Kumar, Saurabh [2 ]
Munos, Remi [1 ]
Bellemare, Marc G. [2 ]
Dabney, Will [1 ]
Affiliations
[1] DeepMind, London, England
[2] Google Brain, Mountain View, CA USA
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019
DOI: Not available
CLC (Chinese Library Classification) number: TP18 [Artificial Intelligence Theory]
Discipline classification codes: 081104; 0812; 0835; 1405
Abstract
We present a unifying framework for designing and analysing distributional reinforcement learning (DRL) algorithms in terms of recursively estimating statistics of the return distribution. Our key insight is that DRL algorithms can be decomposed as the combination of some statistical estimator and a method for imputing a return distribution consistent with that set of statistics. With this new understanding, we are able to provide improved analyses of existing DRL algorithms as well as construct a new algorithm (EDRL) based upon estimation of the expectiles of the return distribution. We compare EDRL with existing methods on a variety of MDPs to illustrate concrete aspects of our analysis, and develop a deep RL variant of the algorithm, ER-DQN, which we evaluate on the Atari-57 suite of games.
Pages: 9
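To make the decomposition described in the abstract concrete, the sketch below (not the authors' implementation; the function names sample_expectile and impute_particles are hypothetical) illustrates the two stages for expectile statistics: an estimator that extracts a set of expectiles from sampled returns, and an imputation step that recovers a particle distribution consistent with those statistics.

```python
# Minimal sketch of the statistic/imputation decomposition, assuming expectile
# statistics as in EDRL. Not the paper's code; shown in isolation on a fixed
# sample of returns rather than inside a distributional Bellman backup.

import numpy as np
from scipy.optimize import brentq  # standard 1-D root finder


def sample_expectile(returns, tau):
    """tau-expectile q of a sample: root in q of tau*E[(Z-q)+] - (1-tau)*E[(q-Z)+]."""
    z = np.asarray(returns, dtype=float)
    cond = lambda q: (tau * np.mean(np.maximum(z - q, 0.0))
                      - (1.0 - tau) * np.mean(np.maximum(q - z, 0.0)))
    # the root always lies strictly between the sample min and max
    return brentq(cond, z.min() - 1e-6, z.max() + 1e-6)


def impute_particles(taus, targets, n_particles=32, lr=0.5, steps=5000):
    """Find particle locations whose tau-expectiles match the target statistics
    by driving the squared expectile-condition residuals to zero."""
    z = np.linspace(min(targets), max(targets), n_particles)
    for _ in range(steps):
        grad = np.zeros_like(z)
        for tau, q in zip(taus, targets):
            resid = (tau * np.mean(np.maximum(z - q, 0.0))
                     - (1.0 - tau) * np.mean(np.maximum(q - z, 0.0)))
            # d(resid)/dz_j = |tau - 1{z_j < q}| / n  almost everywhere
            dresid = np.where(z < q, 1.0 - tau, tau) / z.size
            grad += 2.0 * resid * dresid
        z = z - lr * grad
    return z


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    returns = rng.normal(loc=1.0, scale=2.0, size=10_000)  # sampled returns
    taus = np.linspace(0.05, 0.95, 9)
    stats = [sample_expectile(returns, t) for t in taus]    # (1) estimator step
    particles = impute_particles(taus, stats)               # (2) imputation step
    recovered = [sample_expectile(particles, t) for t in taus]
    print("target expectiles  :", np.round(stats, 3))
    print("imputed expectiles :", np.round(recovered, 3))
```

In the full EDRL algorithm these two steps would sit inside the recursive (Bellman) update of the return-distribution estimate; the sketch only shows how a set of expectile statistics and an imputed distribution consistent with them relate to one another.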