Compositional Reinforcement Learning from Logical Specifications

Cited by: 0
Authors
Jothimurugan, Kishor [1 ]
Bansal, Suguman [1 ]
Bastani, Osbert [1 ]
Alur, Rajeev [1 ]
Affiliations
[1] Univ Penn, Philadelphia, PA 19104 USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We study the problem of learning control policies for complex tasks given by logical specifications. Recent approaches automatically generate a reward function from a given specification and use a suitable reinforcement learning algorithm to learn a policy that maximizes the expected reward. These approaches, however, scale poorly to complex tasks that require high-level planning. In this work, we develop a compositional learning approach, called DIRL, that interleaves high-level planning and reinforcement learning. First, DIRL encodes the specification as an abstract graph; intuitively, vertices and edges of the graph correspond to regions of the state space and simpler sub-tasks, respectively. Our approach then incorporates reinforcement learning to learn neural network policies for each edge (sub-task) within a Dijkstra-style planning algorithm to compute a high-level plan in the graph. An evaluation of the proposed approach on a set of challenging control benchmarks with continuous state and action spaces demonstrates that it outperforms state-of-the-art baselines.
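The planning half of the approach described above can be illustrated with a small sketch: a Dijkstra-style search over an abstract graph whose edges are sub-tasks, where an edge's cost is taken to be -log of the estimated probability that the learned sub-task policy succeeds (so minimizing path cost maximizes the product of success probabilities). The graph, probability table, and function names here are invented for illustration, and the reinforcement-learning side that DIRL interleaves with planning is replaced by a fixed lookup; this is a sketch under those assumptions, not the paper's implementation.

```python
import heapq
import math

def dijkstra_plan(graph, source, target, edge_success_prob):
    """Find a high-level plan (path) through an abstract graph.

    graph: dict mapping vertex -> iterable of successor vertices.
    edge_success_prob: callable (u, v) -> estimated success probability
        of the sub-task policy for edge (u, v). In DIRL these estimates
        would come from reinforcement learning on each sub-task; here
        they are supplied as a stand-in function.
    """
    dist = {source: 0.0}
    prev = {}
    frontier = [(0.0, source)]
    while frontier:
        d, u = heapq.heappop(frontier)
        if u == target:
            break
        if d > dist.get(u, math.inf):
            continue  # stale queue entry
        for v in graph.get(u, ()):
            p = edge_success_prob(u, v)
            if p <= 0.0:
                continue  # sub-task deemed unlearnable; skip this edge
            nd = d - math.log(p)  # cost = sum of -log(success prob)
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(frontier, (nd, v))
    if target not in dist:
        return None
    path, v = [target], target
    while v != source:
        v = prev[v]
        path.append(v)
    return list(reversed(path))

# Toy abstract graph: vertices are state-space regions, edges are sub-tasks.
graph = {"start": ["roomA", "roomB"], "roomA": ["goal"], "roomB": ["goal"]}
probs = {("start", "roomA"): 0.9, ("start", "roomB"): 0.5,
         ("roomA", "goal"): 0.8, ("roomB", "goal"): 0.95}
plan = dijkstra_plan(graph, "start", "goal", lambda u, v: probs[(u, v)])
# Route via roomA wins: 0.9 * 0.8 = 0.72 > 0.5 * 0.95 = 0.475
```

The -log transformation is what makes a shortest-path algorithm applicable: probabilities multiply along a path, while Dijkstra sums edge costs.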
Pages: 14