Memetic Evolution Strategy for Reinforcement Learning

Cited by: 0
Authors
Qu, Xinghua [1 ]
Ong, Yew-Soon [1 ]
Hou, Yaqing [2 ]
Shen, Xiaobo [3 ]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
[2] Dalian Univ Technol, Sch Comp Sci & Technol, Dalian, Peoples R China
[3] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing, Jiangsu, Peoples R China
Keywords
reinforcement learning; memetic algorithm; evolution strategy; Q-learning; neuroevolution; game; Go
DOI
10.1109/cec.2019.8789935
CLC Number
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology]
Discipline Code
0808; 0809
Abstract
Neuroevolution (i.e., training neural networks with evolutionary computation) has successfully solved a range of challenging reinforcement learning (RL) tasks. However, existing neuroevolution methods suffer from high sample complexity, because black-box evaluations (i.e., the accumulated rewards of complete Markov decision processes (MDPs)) discard the temporal frames (i.e., the time-step data instances within an MDP). These temporal frames carry the Markov property of the problem and can therefore also benefit the training of the neural network through temporal difference (TD) learning. In this paper, we propose a memetic reinforcement learning (MRL) framework that optimizes the RL agent by leveraging both black-box evaluations and temporal frames. To this end, an evolution strategy (ES) is combined with Q-learning: ES provides diversified frames to train the agent globally, while Q-learning locally exploits the Markov property within those frames to refine the agent. MRL thus constitutes a novel memetic framework that enables evaluation-free local search via Q-learning. Experiments on a classical control problem verify the efficiency of the proposed MRL, which achieves significantly faster convergence than the canonical ES.
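The global/local interplay described in the abstract can be sketched in miniature. The toy below is an illustrative assumption, not the authors' implementation or benchmark: a tabular Q-function on a hypothetical 5-state chain MDP is trained globally by an OpenAI-style ES gradient estimate on black-box episode returns, while the temporal frames collected during those same rollouts — which canonical ES would discard — drive Q-learning updates as the evaluation-free local search. All environment details, hyperparameters, and function names are made up for the sketch.

```python
import numpy as np

# Toy setup (illustrative, not the paper's benchmark): a 5-state chain MDP
# where the agent starts at state 0 and is rewarded for reaching state 4.
N_STATES, N_ACTIONS, HORIZON = 5, 2, 10

def step(s, a):
    """Deterministic transition: action 1 moves right, action 0 moves left."""
    s2 = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    reached_goal = s2 == N_STATES - 1
    return s2, (1.0 if reached_goal else 0.0), reached_goal

def rollout(q, rng, eps=0.1):
    """Black-box evaluation: run one eps-greedy episode under the table q,
    returning the episode return AND the temporal frames it generated."""
    s, total, frames = 0, 0.0, []
    for _ in range(HORIZON):
        a = int(rng.integers(N_ACTIONS)) if rng.random() < eps else int(np.argmax(q[s]))
        s2, r, done = step(s, a)
        frames.append((s, a, r, s2, done))
        total, s = total + r, s2
        if done:
            break
    return total, frames

def td_refine(q, frames, alpha=0.5, gamma=0.9):
    """Evaluation-free local search: a Q-learning sweep over stored frames."""
    for s, a, r, s2, done in frames:
        target = r if done else r + gamma * np.max(q[s2])
        q[s, a] += alpha * (target - q[s, a])

def memetic_es(generations=30, pop=8, sigma=0.2, lr=0.02, seed=0):
    rng = np.random.default_rng(seed)
    q = np.zeros((N_STATES, N_ACTIONS))
    for _ in range(generations):
        noises, returns, all_frames = [], [], []
        for _ in range(pop):
            noise = rng.normal(size=q.shape)
            ret, frames = rollout(q + sigma * noise, rng)  # perturbed evaluation
            noises.append(noise)
            returns.append(ret)
            all_frames.extend(frames)  # keep the frames ES would normally discard
        # Global step: vanilla ES gradient estimate from black-box returns.
        returns = np.asarray(returns)
        if returns.std() > 0:
            adv = (returns - returns.mean()) / returns.std()
            q += lr / (pop * sigma) * sum(a_i * n for a_i, n in zip(adv, noises))
        # Local step (the memetic part): reuse the same frames for TD learning.
        td_refine(q, all_frames)
    return q

q = memetic_es()
greedy_return, _ = rollout(q, np.random.default_rng(1), eps=0.0)
print(greedy_return)
```

Note that the frames come for free from the ES evaluations, so the TD step adds no extra environment interactions — which is the source of the sample-efficiency gain the abstract claims over canonical ES.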
Pages: 1922-1928 (7 pages)
Related Papers (50 total)
  • [1] Ajani, Oladayo S.; Mallipeddi, Rammohan. Adaptive evolution strategy with ensemble of mutations for Reinforcement Learning. Knowledge-Based Systems, 2022, 245.
  • [2] Grumbach, Felix; Badr, Nour Eldin Alaa; Reusch, Pascal; Trojahn, Sebastian. A Memetic Algorithm With Reinforcement Learning for Sociotechnical Production Scheduling. IEEE Access, 2023, 11: 68760-68775.
  • [3] Wang, Tonghao; Peng, Xingguang; Jin, Yaochu; Xu, Demin. Experience Sharing Based Memetic Transfer Learning for Multiagent Reinforcement Learning. Memetic Computing, 2022, 14(1): 3-17.
  • [4] Fan, Litong; Yu, Dengxiu; Cheong, Kang Hao; Wang, Zhen. Optimal Evolution Strategy for Continuous Strategy Games on Complex Networks via Reinforcement Learning. IEEE Transactions on Neural Networks and Learning Systems, 2024.
  • [5] Yang, Haiying; Demkowicz, Michael J. Reinforcement learning strategy for control of microstructure evolution in phase field models. Computational Materials Science, 2024, 231.
  • [6] Zhang, Jiaxin; Tran, Hoang; Zhang, Guannan. Accelerating Reinforcement Learning with a Directional-Gaussian-Smoothing Evolution Strategy. Electronic Research Archive, 2021, 29(6): 4119-4135.
  • [7] Tan, Zhiping; Li, Kangshun. Differential evolution with mixed mutation strategy based on deep reinforcement learning. Applied Soft Computing, 2021, 111.
  • [8] Wang, Xianjia; Yang, Zhipeng; Liu, Yanli; Chen, Guici. A reinforcement learning-based strategy updating model for the cooperative evolution. Physica A: Statistical Mechanics and its Applications, 2023, 618.