Towards Run-time Efficient Hierarchical Reinforcement Learning

Cited by: 1
Authors
Abramowitz, Sasha [1]
Nitschke, Geoff [1]
Affiliations
[1] Univ Cape Town, Dept Comp Sci, Cape Town, South Africa
Keywords
Hierarchical Reinforcement Learning; Evolution Strategies; Ant Gather; Ant Maze; Ant Push; LEVEL;
DOI
10.1109/CEC55065.2022.9870368
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper investigates a novel method combining Scalable Evolution Strategies (S-ES) and Hierarchical Reinforcement Learning (HRL). S-ES, named for its excellent scalability, was popularised after demonstrating performance comparable to state-of-the-art policy gradient methods. However, S-ES has not been tested in conjunction with HRL methods, which enable temporal abstraction and thus allow agents to tackle more challenging problems. We introduce a method merging S-ES and HRL that yields a highly scalable algorithm that is efficient in compute time. We demonstrate that the proposed method benefits from S-ES's scalability and indifference to delayed rewards. This results in our main contribution: significantly higher learning speed and competitive performance compared to gradient-based HRL methods, across a range of tasks.
Pages: 8