Potential Based Reward Shaping for Hierarchical Reinforcement Learning

Cited by: 0
Authors
Gao, Yang [1]
Toni, Francesca [1]
Affiliations
[1] Imperial Coll London, Dept Comp, London, England
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Hierarchical Reinforcement Learning (HRL) outperforms many 'flat' Reinforcement Learning (RL) algorithms in some application domains. However, HRL may need more time to obtain the optimal policy because of its large action space. Potential Based Reward Shaping (PBRS) has been widely used to incorporate heuristics into flat RL algorithms so as to reduce their exploration. In this paper, we investigate the integration of PBRS and HRL, and propose a new algorithm: PBRS-MAXQ-0. We prove that under certain conditions, PBRS-MAXQ-0 is guaranteed to converge. Empirical results show that PBRS-MAXQ-0 significantly outperforms MAXQ-0 given good heuristics, and can converge even when given misleading heuristics.
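For orientation, the standard flat PBRS construction adds a shaping term gamma * Phi(s') - Phi(s) to the environment reward, where Phi is a heuristic potential over states. The sketch below illustrates only that flat shaping term, not PBRS-MAXQ-0, whose details this record does not give. The names `shaped_reward` and `potential`, the discount of 0.9, and the one-dimensional goal at position 10 are hypothetical choices made for illustration.

```python
from typing import Callable


def shaped_reward(reward: float, state: int, next_state: int,
                  potential: Callable[[int], float],
                  gamma: float = 0.9) -> float:
    """Return the reward plus the shaping term gamma * phi(s') - phi(s)."""
    return reward + gamma * potential(next_state) - potential(state)


# Hypothetical heuristic: states nearer an assumed goal at position 10
# receive a higher (less negative) potential.
GOAL = 10


def potential(state: int) -> float:
    return -abs(GOAL - state)


# Same environment reward (0.0), two different moves from state 4.
toward = shaped_reward(0.0, 4, 5, potential)  # step toward the goal
away = shaped_reward(0.0, 4, 3, potential)    # step away from the goal
print(toward, away)
```

With this potential, the step toward the assumed goal receives a larger shaped reward than the step away from it, which is how a good heuristic can steer exploration.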
Pages: 3504-3510
Number of pages: 7