Adaptive Evolutionary Reinforcement Learning with Policy Direction

Cited by: 0

Authors:

Dong, Caibo [1]

Li, Dazi [1]

Affiliations:

[1] Beijing Univ Chem Technol, Coll Informat Sci & Technol, Beijing 100029, Peoples R China

Source:

NEURAL PROCESSING LETTERS | 2024 / Vol. 56 / Issue 02

Funding:

National Natural Science Foundation of China;

Keywords:

Adaptive evolutionary reinforcement learning; Adaptive evolutionary soft actor-critic; Soft actor-critic; Policy direction; COMPUTATION;

DOI:

10.1007/s11063-024-11548-6

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Evolutionary Reinforcement Learning (ERL) has garnered widespread attention in recent years due to its inherent robustness and parallelism. However, the integration of Evolutionary Algorithms (EAs) and Reinforcement Learning (RL) remains relatively rudimentary and lacks dynamism, which can impact the convergence performance of ERL algorithms. In this study, a dynamic adaptive module is introduced to balance the Evolution Strategies (ES) and RL training within ERL. By incorporating elite strategies, this module leverages advantageous individuals to elevate the overall population's performance. Additionally, RL strategy updates often lack guidance from the population. To address this, we incorporate the strategies of the best individuals from the population, providing valuable policy direction. This is achieved through the formulation of a loss function that employs either L1 or L2 regularization to facilitate RL training. The proposed framework is referred to as Adaptive Evolutionary Reinforcement Learning (AERL). The effectiveness of our framework is evaluated by adopting Soft Actor-Critic (SAC) as the RL algorithm and comparing it with other algorithms in the MuJoCo environment. The results underscore the outstanding convergence performance of our proposed Adaptive Evolutionary Soft Actor-Critic (AESAC) algorithm. Furthermore, ablation experiments are conducted to emphasize the necessity of these two improvements. It is worth noting that the enhancements in AESAC are realized at the population level, enabling broader exploration and effectively reducing the risk of falling into local optima.
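
The policy-direction idea in the abstract (an L1 or L2 term that pulls the RL policy toward the best individual in the population) can be illustrated with a minimal sketch. The abstract does not give the exact loss, so everything here is an assumption made for illustration: the function names `policy_direction_penalty` and `guided_actor_loss`, the `weight` coefficient, the use of a squared L2 distance, and the choice to compare actions produced on the same batch of states.

```python
from typing import Sequence

Vector = Sequence[float]


def policy_direction_penalty(
    rl_actions: Sequence[Vector],
    elite_actions: Sequence[Vector],
    norm: str = "l2",
) -> float:
    """Mean distance between RL-policy actions and elite-individual actions.

    Both inputs hold one action vector per state in a batch. With
    norm="l1" the per-state distance is the sum of absolute differences;
    with norm="l2" it is the sum of squared differences (an assumed form).
    """
    if not rl_actions or len(rl_actions) != len(elite_actions):
        raise ValueError("action batches must be non-empty and equally long")

    total = 0.0
    for a_rl, a_elite in zip(rl_actions, elite_actions):
        diffs = [x - y for x, y in zip(a_rl, a_elite)]
        if norm == "l1":
            total += sum(abs(d) for d in diffs)
        elif norm == "l2":
            total += sum(d * d for d in diffs)
        else:
            raise ValueError("norm must be 'l1' or 'l2'")
    return total / len(rl_actions)


def guided_actor_loss(
    base_loss: float,
    rl_actions: Sequence[Vector],
    elite_actions: Sequence[Vector],
    weight: float = 0.1,
    norm: str = "l2",
) -> float:
    """Add the weighted policy-direction penalty to an existing actor loss."""
    penalty = policy_direction_penalty(rl_actions, elite_actions, norm)
    return base_loss + weight * penalty
```

In a real SAC implementation the penalty would be computed on differentiable tensors so that its gradient reaches the actor network; the plain-float version above only shows the arithmetic of the assumed regularizer.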


Pages: 19

50 items in total

[1] Adaptive Evolutionary Reinforcement Learning with Policy Direction
Caibo Dong
Dazi Li
[J]. Neural Processing Letters, 56
[2] Diversity Evolutionary Policy Deep Reinforcement Learning
Liu, Jian
Feng, Liming
[J]. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2021, 2021
[3] Adaptive evolutionary programming based on reinforcement learning
Zhang, Huaxiang
Lu, Jing
[J]. INFORMATION SCIENCES, 2008, 178 (04) : 971 - 984
[4] Adaptive Optimization in Evolutionary Reinforcement Learning Using Evolutionary Mutation Rates
Zhao, Y.
Ding, Y.
Pei, Y.
[J]. IEEE Access, 2024, 12 : 165384 - 165394
[5] Adaptive Natural Policy Gradient in Reinforcement Learning
Li, Dazi
Qiao, Zengyuan
Song, Tianheng
Jin, Qibing
[J]. PROCEEDINGS OF 2018 IEEE 7TH DATA DRIVEN CONTROL AND LEARNING SYSTEMS CONFERENCE (DDCLS), 2018, : 605 - 610
[6] Evolutionary adaptive-critic methods for reinforcement learning
Xu, X
He, HG
Hu, DW
[J]. CEC'02: PROCEEDINGS OF THE 2002 CONGRESS ON EVOLUTIONARY COMPUTATION, VOLS 1 AND 2, 2002, : 1320 - 1325
[7] Adaptive Multifactorial Evolutionary Optimization for Multitask Reinforcement Learning
Martinez, Aritz D.
Del Ser, Javier
Osaba, Eneko
Herrera, Francisco
[J]. IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, 2022, 26 (02) : 233 - 247
[8] Adaptive Policy Learning for Offline-to-Online Reinforcement Learning
Zheng, Han
Luo, Xufang
Wei, Pengfei
Song, Xuan
Li, Dongsheng
Jiang, Jing
[J]. THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 9, 2023, : 11372 - 11380
[9] Reinforcement Learning with Evolutionary Computation to Policy Search for Autonomous Navigation
Zhang, Chengsi
Dong, Lu
Sun, Changyin
[J]. 2020 35TH YOUTH ACADEMIC ANNUAL CONFERENCE OF CHINESE ASSOCIATION OF AUTOMATION (YAC), 2020, : 288 - 292
[10] An adaptive evolutionary-reinforcement learning algorithm for band selection
Wang, Mingwei
Zhang, Haoming
Yin, Biyu
Chen, Maolin
Liu, Wei
Ye, Zhiwei
[J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 251
