Towards Run-time Efficient Hierarchical Reinforcement Learning

Cited by: 1
Authors
Abramowitz, Sasha [1]
Nitschke, Geoff [1]
Affiliations
[1] Univ Cape Town, Dept Comp Sci, Cape Town, South Africa
Keywords
Hierarchical Reinforcement Learning; Evolution Strategies; Ant Gather; Ant Maze; Ant Push; LEVEL
DOI
10.1109/CEC55065.2022.9870368
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper investigates a novel method combining Scalable Evolution Strategies (S-ES) and Hierarchical Reinforcement Learning (HRL). S-ES, named for its excellent scalability, was popularised with demonstrated performance comparable to state-of-the-art policy gradient methods. However, S-ES has not been tested in conjunction with HRL methods, which enable temporal abstraction and thus allow agents to tackle more challenging problems. We introduce a novel method merging S-ES and HRL, which creates a highly scalable algorithm that is efficient in compute time. We demonstrate that the proposed method benefits from S-ES's scalability and indifference to delayed rewards. This results in our main contribution: significantly higher learning speed and competitive performance compared to gradient-based HRL methods, across a range of tasks.
Pages: 8