Clustering experience replay for the effective exploitation in reinforcement learning

Cited by: 13
Authors
Li, Min [1 ]
Huang, Tianyi [1 ]
Zhu, William [1 ]
Affiliation
[1] Univ Elect Sci & Technol China, Inst Fundamental & Frontier Sci, Chengdu 610054, Peoples R China
Keywords
Reinforcement learning; Clustering; Experience replay; Exploitation efficiency; Time division;
DOI
10.1016/j.patcog.2022.108875
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Reinforcement learning is a useful tool for training an agent to achieve a desired goal in sequential decision-making problems. It trains the agent to make decisions by exploiting the experience contained in the transitions produced by different decisions. To exploit this experience, most reinforcement learning methods replay the explored transitions by uniform sampling, but this easily ignores the most recently explored transitions. Another approach defines the priority of each transition by its estimation error during training and replays transitions according to these priorities; however, it only updates the priorities of the transitions replayed at the current training step, so transitions with low priorities are ignored. In this paper, we propose a clustering experience replay, called CER, to effectively exploit the experience hidden in all explored transitions during training. CER clusters and replays transitions through a divide-and-conquer framework based on time division as follows. First, it divides the whole training process into several periods. Second, at the end of each period, it uses k-means to cluster the transitions explored in that period. Finally, it constructs a conditional probability density function to ensure that all kinds of transitions are sufficiently replayed in the current training. We construct a new method, TD3_CER, to implement our clustering experience replay on TD3. Through theoretical analysis and experiments, we show that TD3_CER is more effective than existing reinforcement learning methods. The source code can be downloaded from https://github.com/grcai/CER-Master. (c) 2022 Elsevier Ltd. All rights reserved.
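The period-wise clustering idea described in the abstract can be sketched as a replay buffer: transitions accumulate over a period, are clustered with k-means at the period's end, and sampling then draws from clusters so rarely explored kinds of transitions are not starved. This is a minimal illustrative sketch only; the class name, the fixed period length, and the uniform-over-clusters sampling rule are assumptions, not the paper's exact conditional probability density.

```python
import numpy as np

class ClusteringReplayBuffer:
    """Illustrative sketch of clustering experience replay (CER):
    divide training into periods, k-means-cluster each period's
    transitions, then sample cluster-first so every cluster keeps
    a chance of being replayed."""

    def __init__(self, k=4, period=1000, seed=0):
        self.k = k                  # number of k-means clusters per period
        self.period = period        # transitions collected per period
        self.rng = np.random.default_rng(seed)
        self.current = []           # transitions of the ongoing period
        self.clusters = []          # one array of transitions per cluster

    def add(self, transition):
        self.current.append(np.asarray(transition, dtype=float))
        if len(self.current) >= self.period:
            self._cluster_current_period()

    def _cluster_current_period(self):
        data = np.stack(self.current)
        labels = self._kmeans(data, self.k)
        # Keep only non-empty clusters.
        self.clusters = [data[labels == c] for c in range(self.k)
                         if np.any(labels == c)]
        self.current = []

    def _kmeans(self, data, k, iters=20):
        # Plain Lloyd's algorithm on the flattened transition vectors.
        centers = data[self.rng.choice(len(data), size=k, replace=False)]
        labels = np.zeros(len(data), dtype=int)
        for _ in range(iters):
            dists = np.linalg.norm(data[:, None] - centers[None], axis=2)
            labels = dists.argmin(axis=1)
            for c in range(k):
                if np.any(labels == c):
                    centers[c] = data[labels == c].mean(axis=0)
        return labels

    def sample(self, batch_size):
        # Draw the cluster first (uniformly), then a transition within it,
        # so small clusters are not drowned out as under uniform replay.
        out = []
        for _ in range(batch_size):
            cluster = self.clusters[self.rng.integers(len(self.clusters))]
            out.append(cluster[self.rng.integers(len(cluster))])
        return np.stack(out)
```

In a full TD3_CER agent the sampled batch would feed the critic and actor updates; here the sketch only shows how time division and clustering reshape the sampling distribution relative to a uniform buffer.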
Pages: 9