Multigrid Reinforcement Learning with Reward Shaping

Authors
Grzes, Marek [1]
Kudenko, Daniel [1]
Affiliation
[1] Univ York, Dept Comp Sci, York YO10 5DD, N Yorkshire, England
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Potential-based reward shaping has been shown to be a powerful method to improve the convergence rate of reinforcement learning. It is a flexible technique to incorporate background knowledge into temporal-difference learning in a principled way. However, the question remains how to compute the potential which is used to shape the reward that is given to the learning agent. In this paper we propose a way to solve this problem in reinforcement learning with state space discretisation. In particular, we show that the potential function can be learned online in parallel with the actual reinforcement learning process. If the Q-function is learned for states determined by a given grid, a V-function for states with lower resolution can be learned in parallel and used to approximate the potential for ground learning. The novel algorithm is presented and experimentally evaluated.
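The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the 1-D chain environment, the block size used to form the coarse grid, the TD(0) update for V, the zero potential at the terminal state, and all parameter values are assumptions chosen for illustration. A Q-function is learned on the fine grid while a V-function on a coarser grid is learned in parallel and used as the potential Phi in the standard shaping term F = gamma * Phi(z') - Phi(z).

```python
import random


def coarse(state, block):
    """Map a fine-grid state to its lower-resolution cell (assumed mapping)."""
    return state // block


def shaping_reward(v, z, z_next, gamma, terminal=False):
    """Potential-based shaping term F = gamma * Phi(z') - Phi(z), with Phi = V.

    The potential of a terminal state is taken to be zero (an assumption).
    """
    next_potential = 0.0 if terminal else v[z_next]
    return gamma * next_potential - v[z]


def run(n_states=20, block=5, episodes=200, alpha=0.1, beta=0.1,
        gamma=0.99, epsilon=0.1, max_steps=1000, seed=0):
    """Q-learning on a chain, shaped by a coarse V learned at the same time."""
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]   # actions: 0 = left, 1 = right
    v = [0.0] * (coarse(goal, block) + 1)       # one value per coarse cell
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s_next = min(goal, max(0, s + (1 if a == 1 else -1)))
            done = s_next == goal
            r = 1.0 if done else 0.0
            z, z_next = coarse(s, block), coarse(s_next, block)

            # Ground level: Q-learning on the environment reward plus shaping.
            f = shaping_reward(v, z, z_next, gamma, terminal=done)
            best_next = 0.0 if done else max(q[s_next])
            q[s][a] += alpha * (r + f + gamma * best_next - q[s][a])

            # Coarse level: TD(0) on the unshaped environment reward.
            v_next = 0.0 if done else v[z_next]
            v[z] += beta * (r + gamma * v_next - v[z])

            if done:
                break
            s = s_next
    return q, v


if __name__ == "__main__":
    q_values, potentials = run()
    print("coarse potentials:", [round(p, 3) for p in potentials])
```

The coarse V-function is updated from the unshaped environment reward, so the potential reflects only what the agent has experienced so far rather than hand-coded background knowledge.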
Pages: 357-366
Number of pages: 10