Towards Run-time Efficient Hierarchical Reinforcement Learning

Cited by: 1

Authors:

Abramowitz, Sasha [1]

Nitschke, Geoff [1]

Affiliations:

[1] Univ Cape Town, Dept Comp Sci, Cape Town, South Africa

Source:

2022 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC) | 2022

Keywords:

Hierarchical Reinforcement Learning; Evolution Strategies; Ant Gather; Ant Maze; Ant Push; LEVEL;

DOI:

10.1109/CEC55065.2022.9870368

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper investigates a novel method combining Scalable Evolution Strategies (S-ES) and Hierarchical Reinforcement Learning (HRL). S-ES, named for its excellent scalability, was popularised by demonstrating performance comparable to state-of-the-art policy gradient methods. However, S-ES has not been tested in conjunction with HRL methods, which enable temporal abstraction and thus allow agents to tackle more challenging problems. We introduce a novel method merging S-ES and HRL, which yields a highly scalable algorithm that is efficient in compute time. We demonstrate that the proposed method benefits from S-ES's scalability and indifference to delayed rewards. This results in our main contribution: significantly higher learning speed and competitive performance compared to gradient-based HRL methods, across a range of tasks.


Pages: 8
