Memetic Evolution Strategy for Reinforcement Learning

Cited: 0
Authors
Qu, Xinghua [1 ]
Ong, Yew-Soon [1 ]
Hou, Yaqing [2 ]
Shen, Xiaobo [3 ]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
[2] Dalian Univ Technol, Sch Comp Sci & Technol, Dalian, Peoples R China
[3] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing, Jiangsu, Peoples R China
Keywords
reinforcement learning; memetic algorithm; evolution strategy; Q learning; NEUROEVOLUTION; GAME; GO;
DOI
10.1109/cec.2019.8789935
Chinese Library Classification (CLC): TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline codes: 0808; 0809;
Abstract
Neuroevolution (i.e., training neural networks with evolutionary computation) has successfully tackled a range of challenging reinforcement learning (RL) tasks. However, existing neuroevolution methods suffer from high sample complexity, because black-box evaluations (i.e., the accumulated rewards of complete Markov Decision Processes (MDPs)) discard large numbers of temporal frames (i.e., the time-step data instances within an MDP). These temporal frames retain the Markov property of the problem and can therefore also benefit the training of the neural network through temporal difference (TD) learning. In this paper, we propose a memetic reinforcement learning (MRL) framework that optimizes the RL agent by leveraging both black-box evaluations and temporal frames. To this end, an evolution strategy (ES) is combined with Q learning: the ES provides diversified frames to train the agent globally, while Q learning locally exploits the Markov property within those frames to refine the agent. MRL thus constitutes a novel memetic framework that enables evaluation-free local search via Q learning. Experiments on a classical control problem verify the efficiency of the proposed MRL, which converges significantly faster than the canonical ES.
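The abstract's two-level loop can be illustrated with a minimal sketch: an ES perturbs the agent's parameters and scores each perturbation by its episode return (the black-box evaluation), while the temporal frames (s, a, r, s') gathered during those rollouts, which plain ES would throw away, feed evaluation-free Q-learning updates. This is not the paper's implementation; the toy chain MDP, tabular Q-values, and all hyperparameters below are illustrative assumptions.

```python
import numpy as np

N_STATES, N_ACTIONS, GAMMA = 5, 2, 0.9

def step(s, a):
    """Toy chain MDP: action 1 moves right, action 0 moves left; reward 1 at the end."""
    s2 = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    r = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, r, s2 == N_STATES - 1

def rollout(theta, max_steps=20):
    """Greedy episode under Q(s, .) = theta[s]; returns black-box score and frames."""
    s, ret, frames = 0, 0.0, []
    for _ in range(max_steps):
        a = int(np.argmax(theta[s]))
        s2, r, done = step(s, a)
        frames.append((s, a, r, s2, done))
        ret += r
        s = s2
        if done:
            break
    return ret, frames

def mrl(iters=50, pop=10, sigma=0.3, es_lr=0.1, q_lr=0.5, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.zeros((N_STATES, N_ACTIONS))  # tabular Q-values stand in for the network
    for _ in range(iters):
        noises, rets, all_frames = [], [], []
        for _ in range(pop):
            e = rng.normal(size=theta.shape)
            ret, frames = rollout(theta + sigma * e)
            noises.append(e)
            rets.append(ret)
            all_frames.extend(frames)        # frames that canonical ES would discard
        # Global step: reward-weighted ES recombination of the perturbations.
        rets = np.asarray(rets)
        adv = (rets - rets.mean()) / (rets.std() + 1e-8)
        theta += es_lr / (pop * sigma) * sum(w * e for w, e in zip(adv, noises))
        # Local step: evaluation-free TD (Q-learning) refinement on collected frames.
        for s, a, r, s2, done in all_frames:
            target = r + (0.0 if done else GAMMA * theta[s2].max())
            theta[s, a] += q_lr * (target - theta[s, a])
    return theta
```

In this sketch the Q-learning step reuses every frame already paid for by the ES population's rollouts, which is the source of the sample-efficiency gain the abstract claims over pure black-box evaluation.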
Pages: 1922 - 1928
Page count: 7
Related Papers
(50 records)
  • [21] Guest Editorial: Special issue on memetic algorithms with learning strategy
    Ling Wang
    Liang Feng
    [J]. MEMETIC COMPUTING, 2021, 13 (02) : 147 - 148
  • [23] Bidding strategy evolution analysis based on multi-task inverse reinforcement learning
    Tang, Qinghu
    Guo, Hongye
    Chen, Qixin
    [J]. ELECTRIC POWER SYSTEMS RESEARCH, 2022, 212
  • [24] Differential evolution based on strategy adaptation and deep reinforcement learning for multimodal optimization problems
    Liao, Zuowen
    Pang, Qishuo
    Gu, Qiong
    [J]. SWARM AND EVOLUTIONARY COMPUTATION, 2024, 87
  • [26] A multilevel sampling strategy based memetic differential evolution for multimodal optimization
    Wang, Xi
    Sheng, Mengmeng
    Ye, Kangfei
    Lin, Jian
    Mao, Jiafa
    Chen, Shengyong
    Sheng, Weiguo
    [J]. NEUROCOMPUTING, 2019, 334 : 79 - 88
  • [27] A memetic-clustering-based evolution strategy for traveling salesman problems
    Wang, Yuping
    Qin, Jinhua
    [J]. ROUGH SETS AND KNOWLEDGE TECHNOLOGY, PROCEEDINGS, 2007, 4481 : 260 - +
  • [28] Deep reinforcement learning assisted memetic scheduling of drones for railway catenary deicing
    Zheng, Yu-Jun
    Xie, Xi-Cheng
    Zhang, Zhi-Yuan
    Shi, Jin-Tang
    [J]. SWARM AND EVOLUTIONARY COMPUTATION, 2024, 91
  • [29] Distributed memetic differential evolution with the synergy of Lamarckian and Baldwinian learning
    Zhang, Chunmei
    Chen, Jie
    Xin, Bin
    [J]. APPLIED SOFT COMPUTING, 2013, 13 (05) : 2947 - 2959
  • [30] Inverse Reinforcement Learning for Strategy Identification
    Rucker, Mark
    Adams, Stephen
    Hayes, Roy
    Beling, Peter A.
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2021, : 3067 - 3074