Hierarchical Reinforcement Learning Based on Continuous Subgoal Space

Times cited: 0
Authors
Wang, Chen [1]
Zeng, Fanyu [1]
Ge, Shuzhi Sam [1, 2]
Jiang, Xin [3]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Peoples R China
[2] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore 119077, Singapore
[3] Harbin Inst Technol, Sch Mech Engn & Automat, Shenzhen 518055, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
DOI
10.1109/rcar49640.2020.9303280
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In hierarchical reinforcement learning, multiple agents operate at different temporal levels to decompose a complex task into several simpler ones. Early in neural network training, large changes in the low-level policy make the transitions of the high-level policy unstable. In this paper, we propose a hierarchical policy that combines PPO and DDPG to handle the simultaneous training of multiple agents. To create an end-to-end policy, neural networks are employed to extract scene features in both the low-level and high-level policies. In addition, a novel internal reward function is designed to enhance the goal-achieving ability of the low-level policy. MiniGrid, a lightweight and fast gridworld Gym environment, is used to test the method's validity. We found that the hierarchical policy is able to explore and plan without dense rewards. This property is of considerable significance for the study of robot navigation, especially in large and complex environments.
Pages: 74-80
Page count: 7