Compositional Reinforcement Learning from Logical Specifications

Cited by: 0
Authors
Jothimurugan, Kishor [1 ]
Bansal, Suguman [1 ]
Bastani, Osbert [1 ]
Alur, Rajeev [1 ]
Affiliations
[1] Univ Penn, Philadelphia, PA 19104 USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We study the problem of learning control policies for complex tasks given by logical specifications. Recent approaches automatically generate a reward function from a given specification and use a suitable reinforcement learning algorithm to learn a policy that maximizes the expected reward. These approaches, however, scale poorly to complex tasks that require high-level planning. In this work, we develop a compositional learning approach, called DIRL, that interleaves high-level planning and reinforcement learning. First, DIRL encodes the specification as an abstract graph; intuitively, vertices and edges of the graph correspond to regions of the state space and simpler sub-tasks, respectively. Our approach then incorporates reinforcement learning to learn neural network policies for each edge (sub-task) within a Dijkstra-style planning algorithm to compute a high-level plan in the graph. An evaluation of the proposed approach on a set of challenging control benchmarks with continuous state and action spaces demonstrates that it outperforms state-of-the-art baselines.
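The planning half of the approach described above can be illustrated with a small sketch: a Dijkstra-style search over an abstract graph whose edges are sub-tasks, where an edge's cost is taken to be -log of the estimated probability that the learned sub-task policy succeeds (so minimizing path cost maximizes the product of success probabilities). The graph, probability table, and function names here are invented for illustration, and the reinforcement-learning side that DIRL interleaves with planning is replaced by a fixed lookup; this is a sketch under those assumptions, not the paper's implementation.

```python
import heapq
import math

def dijkstra_plan(graph, source, target, edge_success_prob):
    """Find a high-level plan (path) through an abstract graph.

    graph: dict mapping vertex -> iterable of successor vertices.
    edge_success_prob: callable (u, v) -> estimated success probability
        of the sub-task policy for edge (u, v). In DIRL these estimates
        would come from reinforcement learning on each sub-task; here
        they are supplied as a stand-in function.
    """
    dist = {source: 0.0}
    prev = {}
    frontier = [(0.0, source)]
    while frontier:
        d, u = heapq.heappop(frontier)
        if u == target:
            break
        if d > dist.get(u, math.inf):
            continue  # stale queue entry
        for v in graph.get(u, ()):
            p = edge_success_prob(u, v)
            if p <= 0.0:
                continue  # sub-task deemed unlearnable; skip this edge
            nd = d - math.log(p)  # cost = sum of -log(success prob)
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(frontier, (nd, v))
    if target not in dist:
        return None
    path, v = [target], target
    while v != source:
        v = prev[v]
        path.append(v)
    return list(reversed(path))

# Toy abstract graph: vertices are state-space regions, edges are sub-tasks.
graph = {"start": ["roomA", "roomB"], "roomA": ["goal"], "roomB": ["goal"]}
probs = {("start", "roomA"): 0.9, ("start", "roomB"): 0.5,
         ("roomA", "goal"): 0.8, ("roomB", "goal"): 0.95}
plan = dijkstra_plan(graph, "start", "goal", lambda u, v: probs[(u, v)])
# Route via roomA wins: 0.9 * 0.8 = 0.72 > 0.5 * 0.95 = 0.475
```

The -log transformation is what makes a shortest-path algorithm applicable: probabilities multiply along a path, while Dijkstra sums edge costs.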
Pages: 14