Decomposing user-defined tasks in a reinforcement learning setup using TextWorld

Cited by: 0

Authors:

Petsanis, Thanos [1]

Keroglou, Christoforos [1]

Kapoutsis, Athanasios Ch. [2]

Kosmatopoulos, Elias B. [1]

Sirakoulis, Georgios Ch. [1]

Affiliations:

[1] Democritus Univ Thrace DUTH, Sch Engn, Dept Elect & Comp Engn, Xanthi, Greece

[2] Informat Technol Inst, Ctr Res & Technol, Thessaloniki, Greece

Source:

Frontiers in Robotics and AI | 2023 / Volume 10

Keywords:

formal methods in robotics and automation; reinforcement learning; hierarchical reinforcement learning; task and motion planning; autonomous agents; ENVIRONMENTS;

DOI:

10.3389/frobt.2023.1280578

Chinese Library Classification (CLC) number:

TP24 [Robotics];

Discipline classification code:

080202 ; 1405 ;

Abstract:

This paper proposes a hierarchical reinforcement learning (HRL) method that decomposes a complex task into simpler sub-tasks and leverages them to improve the training of an autonomous agent in a simulated environment. For practical reasons (i.e., illustrative purposes, ease of implementation, a user-friendly interface, and useful functionalities), we employ two Python frameworks called TextWorld and MiniGrid. MiniGrid functions as a 2D simulated representation of the real environment, while TextWorld functions as a high-level abstraction of this simulated environment. Training on this abstraction disentangles manipulation from navigation actions and allows us to design a dense reward function instead of a sparse reward function for the lower-level environment, which, as we show, improves training performance. Formal methods are utilized throughout the paper to establish that our algorithm is not prevented from deriving solutions.
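The contrast between a sparse and a dense reward described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and uses neither the TextWorld nor the MiniGrid API. The names `SubTask`, `DenseRewardTracker`, and `sparse_reward`, the dictionary-based state, and the bonus and penalty values are all hypothetical, chosen only for illustration. The sketch assumes that a high-level plan has already been decomposed into an ordered list of sub-tasks, each with a predicate that tells whether it is complete.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical flat state description (assumption, not a MiniGrid observation).
State = Dict[str, object]


@dataclass
class SubTask:
    """One high-level step of a decomposed task (illustrative only)."""
    name: str
    is_done: Callable[[State], bool]


@dataclass
class DenseRewardTracker:
    """Pays a bonus each time the next sub-task in the ordered list completes."""
    subtasks: List[SubTask]
    subtask_bonus: float = 1.0   # assumed value
    step_penalty: float = -0.01  # assumed value
    next_index: int = field(default=0, init=False)

    def reward(self, state: State) -> float:
        """Return the dense reward for arriving in `state`."""
        r = self.step_penalty
        # Advance past every pending sub-task that the new state satisfies, in order.
        while (self.next_index < len(self.subtasks)
               and self.subtasks[self.next_index].is_done(state)):
            r += self.subtask_bonus
            self.next_index += 1
        return r

    @property
    def finished(self) -> bool:
        """True once every sub-task has been completed."""
        return self.next_index == len(self.subtasks)


def sparse_reward(state: State, subtasks: List[SubTask]) -> float:
    """Sparse alternative: a reward only when every sub-task holds at once."""
    return 1.0 if all(t.is_done(state) for t in subtasks) else 0.0
```

Under these assumptions, an agent trained with `DenseRewardTracker.reward` receives feedback after each completed sub-task (for example, reaching a key and then picking it up), whereas `sparse_reward` stays at zero until the whole task is solved.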


Pages: 14

50 items in total

[1] Learning to Predict User-Defined Types
Jesse, Kevin
Devanbu, Premkumar T.
Sawant, Anand
[J]. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2023, 49 (04) : 1508 - 1522
[2] User-defined Machine Learning Functions
Herrmann, Markus
Fiedler, Marc
[J]. 3RD INTERNATIONAL CONFERENCE ON ADVANCED RESEARCH METHODS AND ANALYTICS (CARMA 2020), 2020, : 337 - 337
[3] Task-Selection bias: A case for user-defined tasks
[J]. Cordes, R.E. (cordes@us.ibm.com), 1600, Bellwether Publishing, Ltd. (13):
[4] Task-selection bias: A case for user-defined tasks
Cordes, RE
[J]. INTERNATIONAL JOURNAL OF HUMAN-COMPUTER INTERACTION, 2001, 13 (04) : 411 - 419
[5] Learning GUI Completions with User-defined Constraints
Bruckner, Lukas
Leiva, Luis A.
Oulasvirta, Antti
[J]. ACM TRANSACTIONS ON INTERACTIVE INTELLIGENT SYSTEMS, 2022, 12 (01)
[6] POSTERIOR FEATURES APPLIED TO SPEECH RECOGNITION TASKS WITH USER-DEFINED VOCABULARY
Aradilla, Guillermo
Bourlard, Herve
Magimai-Doss, Mathew
[J]. 2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 3809 - 3812
[7] A User-Defined Code Reinforcement Technology Based on LLVM-Obfuscator
Yao, Xue
Li, Bin
Sun, Yahong
[J]. ADVANCES IN COMPUTER SCIENCE AND UBIQUITOUS COMPUTING, 2018, 474 : 688 - 694
[8] Parallelizing User-Defined Aggregations using Symbolic Execution
Raychev, Veselin
Musuvathi, Madanlal
Mytkowicz, Todd
[J]. SOSP'15: PROCEEDINGS OF THE TWENTY-FIFTH ACM SYMPOSIUM ON OPERATING SYSTEMS PRINCIPLES, 2015, : 153 - 167
[9] A Directive Generation Approach Using User-defined Rules
Komatsu, Kazuhiko
Egawa, Ryusuke
Takizawa, Hiroyuki
Kobayashi, Hiroaki
[J]. 2016 FOURTH INTERNATIONAL SYMPOSIUM ON COMPUTING AND NETWORKING (CANDAR), 2016, : 515 - 521
[10] Graph Pattern Mining and Learning through User-defined Relations
Teixeira, Carlos H. C.
Cotta, Leonardo
Ribeiro, Bruno
Meira, Wagner, Jr.
[J]. 2018 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2018, : 1266 - 1271
