Decomposing user-defined tasks in a reinforcement learning setup using TextWorld

Authors
Petsanis, Thanos [1]
Keroglou, Christoforos [1]
Kapoutsis, Athanasios Ch. [2]
Kosmatopoulos, Elias B. [1]
Sirakoulis, Georgios Ch. [1]
Affiliations
[1] Democritus Univ Thrace DUTH, Sch Engn, Dept Elect & Comp Engn, Xanthi, Greece
[2] Informat Technol Inst, Ctr Res & Technol, Thessaloniki, Greece
Keywords
formal methods in robotics and automation; reinforcement learning; hierarchical reinforcement learning; task and motion planning; autonomous agents; environments
DOI
10.3389/frobt.2023.1280578
Chinese Library Classification (CLC) number
TP24 [Robotics]
Discipline classification code
080202; 1405
Abstract
This paper proposes a hierarchical reinforcement learning (HRL) method that decomposes a complex task into simpler sub-tasks and leverages them to improve the training of an autonomous agent in a simulated environment. For practical reasons (illustrative purposes, ease of implementation, a user-friendly interface, and useful functionality), we employ two Python frameworks, TextWorld and MiniGrid. MiniGrid serves as a 2D simulated representation of the real environment, while TextWorld serves as a high-level abstraction of this simulated environment. Training on this abstraction disentangles manipulation actions from navigation actions and allows us to design a dense reward function, rather than a sparse one, for the lower-level environment, which, as we show, improves training performance. Formal methods are used throughout the paper to establish that our algorithm is not prevented from deriving solutions.
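The contrast between a sparse and a dense reward mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the sub-task names, the function names, and the equal-share reward split are all assumptions chosen for illustration, and the sketch does not use the TextWorld or MiniGrid APIs.

```python
from typing import Sequence, Set


def sparse_reward(completed: Set[str], subtasks: Sequence[str]) -> float:
    """Return 1.0 only when every sub-task is complete, otherwise 0.0."""
    return 1.0 if all(s in completed for s in subtasks) else 0.0


def dense_reward(prev_completed: Set[str], completed: Set[str],
                 subtasks: Sequence[str]) -> float:
    """Return an equal share of the total reward for each sub-task
    newly completed since the previous step (assumed shaping rule)."""
    newly_done = [s for s in subtasks
                  if s in completed and s not in prev_completed]
    return len(newly_done) / len(subtasks)


# Hypothetical decomposition of a user-defined task into sub-tasks.
SUBTASKS = ["go to key", "pick up key", "go to door", "open door"]

if __name__ == "__main__":
    done: Set[str] = set()
    for step in SUBTASKS:
        previous = set(done)
        done.add(step)
        print(step,
              "| sparse:", sparse_reward(done, SUBTASKS),
              "| dense:", dense_reward(previous, done, SUBTASKS))
```

Under these assumptions the agent receives feedback after each completed sub-task, whereas the sparse variant stays at zero until the whole task is finished.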
Pages: 14