Hierarchical Reinforcement Learning for Concurrent Discovery of Compound and Composable Policies

Citations: 0
Authors
Esteban, Domingo [1, 2]
Rozo, Leonel [3 ]
Caldwell, Darwin G. [1 ]
Affiliations
[1] Ist Italiano Tecnol, Dept Adv Robot, Via Morego 30, I-16163 Genoa, Italy
[2] Univ Genoa, DIBRIS, Via Opera Pia 13, I-16145 Genoa, Italy
[3] Bosch Ctr Artificial Intelligence, Robert Bosch Campus 1, D-71272 Renningen, Germany
DOI
10.1109/iros40897.2019.8968149
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A common strategy for dealing with the expensive reinforcement learning (RL) of complex tasks is to decompose them into a collection of subtasks that are usually simpler to learn and reusable for new problems. However, when a robot learns the policies for these subtasks, common approaches treat each policy learning process separately. As a result, all of these individual (composable) policies must be learned before the complex task can be tackled through policy composition. Moreover, such composition of individual policies is usually performed sequentially, which is unsuitable for tasks that require the subtasks to be performed concurrently. In this paper, we propose combining a set of composable Gaussian policies corresponding to these subtasks using a set of activation vectors, resulting in a compound Gaussian policy that is a function of the means and covariance matrices of the composable policies. We also propose an algorithm for learning both compound and composable policies within the same learning process by exploiting the off-policy data generated by the compound policy. The algorithm is built on a maximum entropy RL approach to favor exploration during learning. The experimental results show that the experience collected with the compound policy makes it possible not only to solve the complex task but also to obtain useful composable policies that perform successfully in their corresponding subtasks.
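As a rough illustration of how several Gaussian subtask policies might be fused into one Gaussian using activation weights, the sketch below uses an activation-weighted product of diagonal Gaussians. This specific rule is an assumption made here for illustration; the abstract only states that the compound policy is a function of the means and covariance matrices of the composable policies. The function name `compose_gaussians`, the diagonal-covariance simplification, and the example numbers are all hypothetical.

```python
from typing import List, Sequence, Tuple


def compose_gaussians(
    means: Sequence[Sequence[float]],
    variances: Sequence[Sequence[float]],
    activations: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Fuse K diagonal Gaussians over a D-dimensional action, one dimension at a time.

    Assumption (not taken from the abstract): an activation-weighted product of
    Gaussians, so precisions (1 / variance) add after scaling by the activations.
    """
    n_dims = len(means[0])
    fused_mean: List[float] = []
    fused_var: List[float] = []
    for d in range(n_dims):
        # Activation-weighted precision of each composable policy in dimension d.
        precisions = [w[d] / v[d] for w, v in zip(activations, variances)]
        total = sum(precisions)
        if total <= 0.0:
            raise ValueError("each dimension needs at least one positive activation")
        fused_var.append(1.0 / total)
        # The fused mean is the precision-weighted average of the individual means.
        fused_mean.append(sum(p * m[d] for p, m in zip(precisions, means)) / total)
    return fused_mean, fused_var


# Two hypothetical subtask policies over a 2-D action, each with unit variance.
mean, var = compose_gaussians(
    means=[[1.0, 0.0], [3.0, 2.0]],
    variances=[[1.0, 1.0], [1.0, 1.0]],
    activations=[[0.5, 1.0], [0.5, 0.0]],
)
```

In this example both policies share control of the first action dimension equally, while the second dimension is driven only by the first policy, which is one way per-dimension activations could let subtasks act concurrently.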
Pages: 1818 - 1825
Number of pages: 8
Related papers
50 in total
  • [21] FAOD: Fast Automatic Option Discovery in Hierarchical Reinforcement Learning
    Koudad, Zoulikha
    INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2021, 30 (02)
  • [22] An agent with a sense of direction for option discovery in hierarchical reinforcement learning
    Koudad, Zoulikha
    Merzoug, Mohamed
    Benamar, Abdelkrim
    INTERNATIONAL JOURNAL OF MODELING SIMULATION AND SCIENTIFIC COMPUTING, 2024,
  • [23] Composable Deep Reinforcement Learning for Robotic Manipulation
    Haarnoja, Tuomas
    Pong, Vitchyr
    Zhou, Aurick
    Dalal, Murtaza
    Abbeel, Pieter
    Levine, Sergey
    2018 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2018, : 6244 - 6251
  • [24] A Composable Specification Language for Reinforcement Learning Tasks
    Jothimurugan, Kishor
    Alur, Rajeev
    Bastani, Osbert
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [25] End-to-End Hierarchical Reinforcement Learning With Integrated Subgoal Discovery
    Pateria, Shubham
    Subagdja, Budhitama
    Tan, Ah-Hwee
    Quek, Chai
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (12) : 7778 - 7790
  • [26] Connect-based subgoal discovery for options in hierarchical reinforcement learning
    Chen, Fei
    Gao, Yang
    Chen, Shifu
    Ma, Zhenduo
    ICNC 2007: THIRD INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, VOL 4, PROCEEDINGS, 2007, : 698 - +
  • [27] Hypothesis-Driven Skill Discovery for Hierarchical Deep Reinforcement Learning
    Chuck, Caleb
    Chockchowwat, Supawit
    Niekum, Scott
    2020 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2020, : 5572 - 5579
  • [28] A Task-Agnostic Regularizer for Diverse Subpolicy Discovery in Hierarchical Reinforcement Learning
    Huo, Liangyu
    Wang, Zulin
    Xu, Mai
    Song, Yuhang
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2023, 53 (03) : 1932 - 1944
  • [29] Learning Curriculum Policies for Reinforcement Learning
    Narvekar, Sanmit
    Stone, Peter
    AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 25 - 33
  • [30] Hierarchical reinforcement learning with OMQ
    Shen, Jing
    Liu, Haibo
    Gu, Guochang
    PROCEEDINGS OF THE FIFTH IEEE INTERNATIONAL CONFERENCE ON COGNITIVE INFORMATICS, VOLS 1 AND 2, 2006, : 584 - 588