Towards Run-time Efficient Hierarchical Reinforcement Learning

Cited by: 1
Authors
Abramowitz, Sasha [1]
Nitschke, Geoff [1]
Affiliations
[1] Univ Cape Town, Dept Comp Sci, Cape Town, South Africa
Keywords
Hierarchical Reinforcement Learning; Evolution Strategies; Ant Gather; Ant Maze; Ant Push; LEVEL;
DOI
10.1109/CEC55065.2022.9870368
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper investigates a novel method that combines Scalable Evolution Strategies (S-ES) with Hierarchical Reinforcement Learning (HRL). S-ES, named for its excellent scalability, was popularised by its demonstrated performance, which is comparable to that of state-of-the-art policy gradient methods. However, S-ES has not previously been tested in conjunction with HRL methods, which enable temporal abstraction and thus allow agents to tackle more challenging problems. We introduce a method that merges S-ES and HRL, yielding an algorithm that is highly scalable and efficient in compute time. We demonstrate that the proposed method benefits from S-ES's scalability and its indifference to delayed rewards. This leads to our main contribution: significantly higher learning speed than gradient-based HRL methods, with competitive performance, across a range of tasks.
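For readers unfamiliar with evolution strategies, the following is a minimal sketch of a single flat ES parameter update, assuming mirrored (antithetic) Gaussian perturbations and a toy quadratic fitness function. It is not the authors' implementation and does not include the hierarchical component; the function name `es_step`, the hyperparameter values, and the toy objective are all hypothetical choices made for illustration.

```python
import random


def es_step(theta, fitness, rng, pop_size=50, sigma=0.1, lr=0.05):
    """One ES update: perturb the parameters, score each perturbation,
    and move along the resulting gradient estimate (illustrative only)."""
    dim = len(theta)
    grad = [0.0] * dim
    for _ in range(pop_size // 2):
        eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Mirrored sampling: evaluate both +eps and -eps to reduce variance.
        f_pos = fitness([t + sigma * e for t, e in zip(theta, eps)])
        f_neg = fitness([t - sigma * e for t, e in zip(theta, eps)])
        for i in range(dim):
            grad[i] += (f_pos - f_neg) * eps[i]
    # Each pair counts as two population members, hence pop_size * sigma.
    scale = lr / (pop_size * sigma)
    return [t + scale * g for t, g in zip(theta, grad)]


# Toy objective (an assumption): maximise negative squared distance to a target.
target = [1.0, -2.0, 0.5]


def fitness(params):
    return -sum((p - t) ** 2 for p, t in zip(params, target))


rng = random.Random(0)
theta = [0.0, 0.0, 0.0]
for _ in range(200):
    theta = es_step(theta, fitness, rng)
```

Each fitness evaluation is independent of the others, so in principle the perturbations can be scored in parallel. The update also uses only the scalar fitness of each perturbation, with no per-step reward signal.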
Pages: 8