Statistics and Samples in Distributional Reinforcement Learning

Cited by: 0
Authors
Rowland, Mark [1 ]
Dadashi, Robert [2 ]
Kumar, Saurabh [2 ]
Munos, Remi [1 ]
Bellemare, Marc G. [2 ]
Dabney, Will [1 ]
Affiliations
[1] DeepMind, London, England
[2] Google Brain, Mountain View, CA USA
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019
DOI: Not available
CLC (Chinese Library Classification) number: TP18 [Artificial Intelligence Theory]
Discipline classification codes: 081104; 0812; 0835; 1405
Abstract
We present a unifying framework for designing and analysing distributional reinforcement learning (DRL) algorithms in terms of recursively estimating statistics of the return distribution. Our key insight is that DRL algorithms can be decomposed as the combination of some statistical estimator and a method for imputing a return distribution consistent with that set of statistics. With this new understanding, we are able to provide improved analyses of existing DRL algorithms as well as construct a new algorithm (EDRL) based upon estimation of the expectiles of the return distribution. We compare EDRL with existing methods on a variety of MDPs to illustrate concrete aspects of our analysis, and develop a deep RL variant of the algorithm, ER-DQN, which we evaluate on the Atari-57 suite of games.
Pages: 9
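To make the decomposition described in the abstract concrete, the sketch below (not the authors' implementation; the function names sample_expectile and impute_particles are hypothetical) illustrates the two stages for expectile statistics: an estimator that extracts a set of expectiles from sampled returns, and an imputation step that recovers a particle distribution consistent with those statistics.

```python
# Minimal sketch of the statistic/imputation decomposition, assuming expectile
# statistics as in EDRL. Not the paper's code; shown in isolation on a fixed
# sample of returns rather than inside a distributional Bellman backup.

import numpy as np
from scipy.optimize import brentq  # standard 1-D root finder


def sample_expectile(returns, tau):
    """tau-expectile q of a sample: root in q of tau*E[(Z-q)+] - (1-tau)*E[(q-Z)+]."""
    z = np.asarray(returns, dtype=float)
    cond = lambda q: (tau * np.mean(np.maximum(z - q, 0.0))
                      - (1.0 - tau) * np.mean(np.maximum(q - z, 0.0)))
    # the root always lies strictly between the sample min and max
    return brentq(cond, z.min() - 1e-6, z.max() + 1e-6)


def impute_particles(taus, targets, n_particles=32, lr=0.5, steps=5000):
    """Find particle locations whose tau-expectiles match the target statistics
    by driving the squared expectile-condition residuals to zero."""
    z = np.linspace(min(targets), max(targets), n_particles)
    for _ in range(steps):
        grad = np.zeros_like(z)
        for tau, q in zip(taus, targets):
            resid = (tau * np.mean(np.maximum(z - q, 0.0))
                     - (1.0 - tau) * np.mean(np.maximum(q - z, 0.0)))
            # d(resid)/dz_j = |tau - 1{z_j < q}| / n  almost everywhere
            dresid = np.where(z < q, 1.0 - tau, tau) / z.size
            grad += 2.0 * resid * dresid
        z = z - lr * grad
    return z


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    returns = rng.normal(loc=1.0, scale=2.0, size=10_000)  # sampled returns
    taus = np.linspace(0.05, 0.95, 9)
    stats = [sample_expectile(returns, t) for t in taus]    # (1) estimator step
    particles = impute_particles(taus, stats)               # (2) imputation step
    recovered = [sample_expectile(particles, t) for t in taus]
    print("target expectiles  :", np.round(stats, 3))
    print("imputed expectiles :", np.round(recovered, 3))
```

In the full EDRL algorithm these two steps would sit inside the recursive (Bellman) update of the return-distribution estimate; the sketch only shows how a set of expectile statistics and an imputed distribution consistent with them relate to one another.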