Adaptive Evolutionary Reinforcement Learning with Policy Direction

Cited by: 0
Authors
Dong, Caibo [1]
Li, Dazi [1]
Affiliations
[1] Beijing Univ Chem Technol, Coll Informat Sci & Technol, Beijing 100029, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Adaptive evolutionary reinforcement learning; Adaptive evolutionary soft actor-critic; Soft actor-critic; Policy direction; COMPUTATION;
DOI
10.1007/s11063-024-11548-6
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Evolutionary Reinforcement Learning (ERL) has garnered widespread attention in recent years due to its inherent robustness and parallelism. However, the integration of Evolutionary Algorithms (EAs) and Reinforcement Learning (RL) remains relatively rudimentary and lacks dynamism, which can impact the convergence performance of ERL algorithms. In this study, a dynamic adaptive module is introduced to balance the Evolution Strategies (ES) and RL training within ERL. By incorporating elite strategies, this module leverages advantageous individuals to elevate the overall population's performance. Additionally, RL strategy updates often lack guidance from the population. To address this, we incorporate the strategies of the best individuals from the population, providing valuable policy direction. This is achieved through the formulation of a loss function that employs either L1 or L2 regularization to facilitate RL training. The proposed framework is referred to as Adaptive Evolutionary Reinforcement Learning (AERL). The effectiveness of our framework is evaluated by adopting Soft Actor-Critic (SAC) as the RL algorithm and comparing it with other algorithms in the MuJoCo environment. The results underscore the outstanding convergence performance of our proposed Adaptive Evolutionary Soft Actor-Critic (AESAC) algorithm. Furthermore, ablation experiments are conducted to emphasize the necessity of these two improvements. It is worth noting that the enhancements in AESAC are realized at the population level, enabling broader exploration and effectively reducing the risk of falling into local optima.
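The abstract describes the policy-direction term only in words: a loss that uses L1 or L2 regularization to pull the RL policy toward the best individual in the population. The following is a minimal sketch of one way such a term could look, not the authors' implementation. It assumes the penalty is computed on the actions that the RL policy and the elite policy produce for the same batch of states, that the L2 variant is a squared distance, and that the term is added to the base actor loss with a fixed weight. The names `policy_direction_penalty`, `guided_actor_loss`, and `weight` are hypothetical.

```python
from typing import Sequence

Action = Sequence[float]


def policy_direction_penalty(
    rl_actions: Sequence[Action],
    elite_actions: Sequence[Action],
    norm: str = "l2",
) -> float:
    """Batch-mean L1 or squared-L2 distance between two sets of actions.

    Assumption: both policies were evaluated on the same batch of states,
    so rl_actions[i] and elite_actions[i] refer to the same state.
    """
    if not rl_actions or len(rl_actions) != len(elite_actions):
        raise ValueError("action batches must be non-empty and equal in size")
    if norm not in ("l1", "l2"):
        raise ValueError("norm must be 'l1' or 'l2'")

    total = 0.0
    for a_rl, a_elite in zip(rl_actions, elite_actions):
        diffs = [x - y for x, y in zip(a_rl, a_elite)]
        if norm == "l1":
            total += sum(abs(d) for d in diffs)
        else:
            total += sum(d * d for d in diffs)
    return total / len(rl_actions)


def guided_actor_loss(
    base_loss: float,
    rl_actions: Sequence[Action],
    elite_actions: Sequence[Action],
    weight: float = 0.1,  # hypothetical coefficient, not taken from the paper
    norm: str = "l2",
) -> float:
    """Base actor loss plus a weighted policy-direction penalty (a sketch)."""
    penalty = policy_direction_penalty(rl_actions, elite_actions, norm)
    return base_loss + weight * penalty
```

In this sketch the penalty is zero when the RL policy already matches the elite individual, so the base loss is left unchanged in that case. In an actual SAC training loop the same expression would be written with differentiable tensors so that gradients flow into the actor.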
Pages: 19
Related papers
50 in total
  • [1] Adaptive Evolutionary Reinforcement Learning with Policy Direction
    Caibo Dong
    Dazi Li
    [J]. Neural Processing Letters, 56
  • [2] Diversity Evolutionary Policy Deep Reinforcement Learning
    Liu, Jian
    Feng, Liming
    [J]. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2021, 2021
  • [3] Adaptive evolutionary programming based on reinforcement learning
    Zhang, Huaxiang
    Lu, Jing
    [J]. INFORMATION SCIENCES, 2008, 178 (04) : 971 - 984
  • [4] Adaptive Optimization in Evolutionary Reinforcement Learning Using Evolutionary Mutation Rates
    Zhao, Y.
    Ding, Y.
    Pei, Y.
    [J]. IEEE Access, 2024, 12 : 165384 - 165394
  • [5] Adaptive Natural Policy Gradient in Reinforcement Learning
    Li, Dazi
    Qiao, Zengyuan
    Song, Tianheng
    Jin, Qibing
    [J]. PROCEEDINGS OF 2018 IEEE 7TH DATA DRIVEN CONTROL AND LEARNING SYSTEMS CONFERENCE (DDCLS), 2018, : 605 - 610
  • [6] Evolutionary adaptive-critic methods for reinforcement learning
    Xu, X
    He, HG
    Hu, DW
    [J]. CEC'02: PROCEEDINGS OF THE 2002 CONGRESS ON EVOLUTIONARY COMPUTATION, VOLS 1 AND 2, 2002, : 1320 - 1325
  • [7] Adaptive Multifactorial Evolutionary Optimization for Multitask Reinforcement Learning
    Martinez, Aritz D.
    Del Ser, Javier
    Osaba, Eneko
    Herrera, Francisco
    [J]. IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, 2022, 26 (02) : 233 - 247
  • [8] Adaptive Policy Learning for Offline-to-Online Reinforcement Learning
    Zheng, Han
    Luo, Xufang
    Wei, Pengfei
    Song, Xuan
    Li, Dongsheng
    Jiang, Jing
    [J]. THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 9, 2023, : 11372 - 11380
  • [9] Reinforcement Learning with Evolutionary Computation to Policy Search for Autonomous Navigation
    Zhang, Chengsi
    Dong, Lu
    Sun, Changyin
    [J]. 2020 35TH YOUTH ACADEMIC ANNUAL CONFERENCE OF CHINESE ASSOCIATION OF AUTOMATION (YAC), 2020, : 288 - 292
  • [10] An adaptive evolutionary-reinforcement learning algorithm for band selection
    Wang, Mingwei
    Zhang, Haoming
    Yin, Biyu
    Chen, Maolin
    Liu, Wei
    Ye, Zhiwei
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 251