A Novel Adaptive Sampling Strategy for Deep Reinforcement Learning

Cited by: 1
Authors
Liang, Xingxing [1]
Chen, Li [1]
Feng, Yanghe [1]
Liu, Zhong [1]
Ma, Yang [1]
Huang, Kuihua [1]
Affiliations
[1] Natl Univ Def Technol, Coll Syst Engn, Changsha, Peoples R China
Keywords
Deep reinforcement learning; an adaptive factor; DQN; Actor-Critic (AC) algorithm; GAME; GO;
DOI
10.1142/S1469026821500115
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Reinforcement learning, an effective method for solving complex sequential decision-making problems, plays an important role in areas such as intelligent decision-making and behavioral cognition. Experience replay has contributed to the development of current deep reinforcement learning by reusing past samples to improve sample efficiency. However, the existing prioritized experience replay mechanism changes the sample distribution in the sample set because it assigns a higher sampling frequency to specific transitions, and it cannot be applied to actor-critic and other on-policy reinforcement learning algorithms. To address this, we propose an adaptive factor based on TD-error, which further increases sample utilization by giving more attention weight to samples with larger TD-error, and we embed it flexibly into the original Deep Q Network and Advantage Actor-Critic algorithms to improve their performance. We evaluated the proposed architecture on CartPole-V1 and on 6 Atari game environments. Under both fixed-temperature and annealing-temperature conditions, the results show that the improved algorithms achieve advantages in cumulative reward and climb speed over the vanilla DQN and the original A2C.
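The abstract does not give the formula for the adaptive factor, so the following is only a minimal sketch of the general idea under stated assumptions: each sample in a uniformly drawn batch receives a weight from a softmax over |TD-error| divided by a temperature, rescaled so the weights average to 1, and the weight multiplies that sample's squared TD-error in the loss. The softmax form, the mean-1 rescaling, the function names, and the linear annealing schedule with its numeric values are all hypothetical illustrations, not the authors' method.

```python
import math


def adaptive_weights(td_errors, temperature=1.0):
    """Softmax over |TD-error| / temperature, rescaled to average 1.

    Assumption: the softmax form and the mean-1 rescaling are illustrative
    choices, not a formula taken from the paper.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [abs(d) / temperature for d in td_errors]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    n = len(td_errors)
    return [n * e / total for e in exps]


def weighted_td_loss(td_errors, temperature=1.0):
    """Mean squared TD-error with each sample scaled by its adaptive weight."""
    weights = adaptive_weights(td_errors, temperature)
    return sum(w * d * d for w, d in zip(weights, td_errors)) / len(td_errors)


def annealed_temperature(step, start=5.0, end=0.5, decay_steps=10_000):
    """Linear annealing from start to end (hypothetical schedule and values)."""
    frac = min(max(step / decay_steps, 0.0), 1.0)
    return start + frac * (end - start)


if __name__ == "__main__":
    batch_td = [0.1, -0.5, 2.0, 0.05]  # made-up TD-errors for one batch
    print(adaptive_weights(batch_td, temperature=1.0))
    print(weighted_td_loss(batch_td, temperature=annealed_temperature(2_000)))
```

Because the batch is still sampled uniformly and only the loss is reweighted, a scheme of this shape leaves the sampling distribution unchanged and can be applied to on-policy batches as well. A large temperature pushes all weights toward 1, recovering the unweighted loss, while a small temperature concentrates the weight on the largest TD-errors. In a deep learning framework the weights would normally be treated as constants with no gradient flowing through them.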
Pages: 20
Related papers
50 in total
  • [31] Collaborative Optimization of Energy Management Strategy and Adaptive Cruise Control Based on Deep Reinforcement Learning
    Peng, Jiankun
    Fan, Yi
    Yin, Guodong
    Jiang, Ruhai
    [J]. IEEE TRANSACTIONS ON TRANSPORTATION ELECTRIFICATION, 2023, 9 (01) : 34 - 44
  • [32] Learning a Diagnostic Strategy on Medical Data With Deep Reinforcement Learning
    Zhu, Mengxiao
    Zhu, Haogang
    [J]. IEEE ACCESS, 2021, 9 : 84122 - 84133
  • [33] A Penetration Strategy Combining Deep Reinforcement Learning and Imitation Learning
    Wang, Xiaofang
    Gu, Kunren
    [J]. Yuhang Xuebao/Journal of Astronautics, 2023, 44 (06) : 914 - 925
  • [34] A Novel Trading Strategy Framework Based on Reinforcement Deep Learning for Financial Market Predictions
    Cheng, Li-Chen
    Huang, Yu-Hsiang
    Hsieh, Ming-Hua
    Wu, Mu-En
    [J]. MATHEMATICS, 2021, 9 (23)
  • [35] Adaptive evolution strategy with ensemble of mutations for Reinforcement Learning
    Ajani, Oladayo S.
    Mallipeddi, Rammohan
    [J]. KNOWLEDGE-BASED SYSTEMS, 2022, 245
  • [36] Adaptive evolution strategy with ensemble of mutations for Reinforcement Learning
    Ajani, Oladayo S.
    Mallipeddi, Rammohan
    [J]. Knowledge-Based Systems, 2022, 245
  • [37] Variable Sampling Period Adaptive Control Based on Reinforcement Learning
    Lemos, Joao M.
    Parente, Francisco
    Cunha, Rita
    [J]. CONTROLO 2022, 2022, 930 : 577 - 586
  • [38] MARLAS: Multi Agent Reinforcement Learning for Cooperated Adaptive Sampling
    Pan, Lishuo
    Manjanna, Sandeep
    Hsieh, M. Ani
    [J]. DISTRIBUTED AUTONOMOUS ROBOTIC SYSTEMS, DARS 2022, 2024, 28 : 347 - 362
  • [39] Optimistic Sampling Strategy for Data-Efficient Reinforcement Learning
    Zhao, Dongfang
    Liu, Jiafeng
    Wu, Rui
    Cheng, Dansong
    Tang, Xianglong
    [J]. IEEE ACCESS, 2019, 7 : 55763 - 55769
  • [40] Hierarchical Deep Reinforcement Learning for VWAP Strategy Optimization
    Li, Xiaodong
    Wu, Pangjing
    Zou, Chenxin
    Li, Qing
    [J]. IEEE TRANSACTIONS ON BIG DATA, 2024, 10 (03) : 288 - 300