Maximum Entropy-Regularized Multi-Goal Reinforcement Learning

Cited by: 0
Authors
Zhao, Rui [1, 2]
Sun, Xudong [1]
Tresp, Volker [1, 2]
Affiliations
[1] Ludwig Maximilian Univ Munich, Fac Math Informat & Stat, Munich, Germany
[2] Siemens AG, Munich, Germany
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In Multi-Goal Reinforcement Learning, an agent learns to achieve multiple goals with a goal-conditioned policy. During learning, the agent first collects trajectories into a replay buffer, and these trajectories are later selected at random for replay. However, the achieved goals in the replay buffer are often biased towards the behavior policies. From a Bayesian perspective, when there is no prior knowledge about the target goal distribution, the agent should learn uniformly from diverse achieved goals. Therefore, we first propose a novel multi-goal RL objective based on weighted entropy. This objective encourages the agent to maximize the expected return as well as to achieve more diverse goals. Second, we develop a maximum entropy-based prioritization framework to optimize the proposed objective. To evaluate this framework, we combine it with Deep Deterministic Policy Gradient, both with and without Hindsight Experience Replay. On a set of multi-goal robotic tasks in OpenAI Gym, we compare our method with other baselines and show promising improvements in both performance and sample efficiency.
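The idea of replaying rarely achieved goals more often can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how goal density is estimated or how priorities are computed, so the histogram density estimate, the inverse-count weighting, and the names `rarity_weights` and `bins` are all assumptions made for illustration.

```python
import numpy as np

def rarity_weights(achieved_goals, bins=10):
    """Sampling probabilities that favor rarely achieved goals.

    Assumption: goal density is approximated with a simple histogram;
    the source does not state which density model is used.
    """
    goals = np.asarray(achieved_goals, dtype=float)
    hist, edges = np.histogramdd(goals, bins=bins)
    # Find the histogram bin of every goal along every dimension.
    idx = []
    for d, e in enumerate(edges):
        i = np.searchsorted(e, goals[:, d], side="right") - 1
        idx.append(np.clip(i, 0, len(e) - 2))  # last bin includes its right edge
    counts = hist[tuple(idx)]
    # Inverse-count weighting gives every occupied bin equal total mass,
    # so replayed goals are uniform over the occupied bins.
    weights = 1.0 / counts
    return weights / weights.sum()

# Toy buffer: 8 trajectories ended at goal (0, 0), only 2 at goal (1, 1).
goals = np.array([[0.0, 0.0]] * 8 + [[1.0, 1.0]] * 2)
probs = rarity_weights(goals, bins=2)

# Draw a prioritized replay batch of trajectory indices.
rng = np.random.default_rng(0)
batch = rng.choice(len(goals), size=4, p=probs)
```

In this toy buffer, uniform replay would pick the common goal 80% of the time, whereas the weighted scheme gives each of the two goal regions half of the replay probability.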
Pages: 10