Towards Run-time Efficient Hierarchical Reinforcement Learning

Cited by: 1
Authors
Abramowitz, Sasha [1]
Nitschke, Geoff [1]
Affiliations
[1] Univ Cape Town, Dept Comp Sci, Cape Town, South Africa
Keywords
Hierarchical Reinforcement Learning; Evolution Strategies; Ant Gather; Ant Maze; Ant Push; LEVEL;
DOI
10.1109/CEC55065.2022.9870368
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper investigates a novel method combining Scalable Evolution Strategies (S-ES) and Hierarchical Reinforcement Learning (HRL). S-ES, named for its excellent scalability, was popularised with demonstrated performance comparable to state-of-the-art policy gradient methods. However, S-ES has not been tested in conjunction with HRL methods, which enable temporal abstraction and thus allow agents to tackle more challenging problems. We introduce a novel method merging S-ES and HRL, which yields a highly scalable algorithm that is efficient in compute time. We demonstrate that the proposed method benefits from S-ES's scalability and indifference to delayed rewards. This results in our main contribution: significantly higher learning speed and competitive performance compared to gradient-based HRL methods, across a range of tasks.
Pages: 8