A Novel Adaptive Sampling Strategy for Deep Reinforcement Learning

Cited by: 1
Authors
Liang, Xingxing [1 ]
Chen, Li [1 ]
Feng, Yanghe [1 ]
Liu, Zhong [1 ]
Ma, Yang [1 ]
Huang, Kuihua [1 ]
Affiliations
[1] National University of Defense Technology, College of Systems Engineering, Changsha, People's Republic of China
Keywords
Deep reinforcement learning; adaptive factor; DQN; Actor-Critic (AC) algorithm; game; Go
DOI
10.1142/S1469026821500115
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Reinforcement learning, an effective method for solving complex sequential decision-making problems, plays an important role in areas such as intelligent decision-making and behavioral cognition. It is well known that the experience replay mechanism has contributed to the development of deep reinforcement learning by reusing past samples to improve sample efficiency. However, the existing prioritized experience replay mechanism changes the sample distribution in the replay buffer, because higher sampling frequencies are assigned to specific transitions, and it cannot be applied to actor-critic and other on-policy reinforcement learning algorithms. To address this, we propose an adaptive factor based on the TD-error, which further increases sample utilization by assigning larger attention weights to samples with larger TD-errors, and we embed it flexibly into the original Deep Q Network (DQN) and Advantage Actor-Critic (A2C) algorithms to improve their performance. We then evaluate the proposed architecture on CartPole-v1 and six Atari game environments. Under both fixed-temperature and annealed-temperature settings, the results show that the improved algorithms outperform the vanilla DQN and the original A2C in cumulative reward and learning speed.
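The abstract does not give the exact form of the adaptive factor, so the following is a minimal sketch of one plausible reading: a softmax attention weight over absolute TD-errors, with a temperature parameter that can be held fixed or annealed, matching the two experimental settings mentioned above. The function name adaptive_td_weights and the weighted-loss usage below are illustrative assumptions, not the authors' code.

```python
import numpy as np

def adaptive_td_weights(td_errors, temperature=1.0):
    # Hypothetical reading of the paper's adaptive factor:
    # a softmax over |TD-error|, so samples with larger TD-error
    # receive larger attention weights. A smaller (or annealed)
    # temperature sharpens the distribution toward high-error samples.
    scores = np.abs(td_errors) / temperature
    scores = scores - scores.max()          # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()

# Usage sketch: re-weight the per-sample squared TD loss of a
# uniformly sampled batch instead of resampling the buffer, so the
# underlying sample distribution is left unchanged.
td = np.array([0.10, -2.50, 0.70, 0.05])    # TD-errors of one batch
w = adaptive_td_weights(td, temperature=0.5)
weighted_loss = np.sum(w * td ** 2)         # replaces the uniform mean
print(weighted_loss)
```

Because the weighting is applied inside the loss rather than through non-uniform sampling, the same factor can in principle be attached to an on-policy objective such as the A2C loss, which is the property the abstract emphasizes over prioritized replay.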
Pages: 20
Related Papers
50 records in total
  • [41] A Precision Advertising Strategy Based on Deep Reinforcement Learning
    Liang, Haiqing
    [J]. Ingenierie des Systemes d'Information, 2020, 25 (03): 397 - 403
  • [42] A Human Mixed Strategy Approach to Deep Reinforcement Learning
    Nguyen, Ngoc Duy
    Nahavandi, Saeid
    Nguyen, Thanh
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2018, : 4023 - 4028
  • [43] A Stock Trading Strategy Based on Deep Reinforcement Learning
    Khemlichi, Firdaous
    Chougrad, Hiba
    Khamlichi, Youness Idrissi
    El Boushaki, Abdessamad
    Ben Ali, Safae El Haj
    [J]. ADVANCED INTELLIGENT SYSTEMS FOR SUSTAINABLE DEVELOPMENT (AI2SD'2020), VOL 2, 2022, 1418 : 920 - 928
  • [44] Deep reinforcement learning for acceptance strategy in bilateral negotiations
    Razeghi, Yousef
    Yavuz, Ozan
    Aydogan, Reyhan
    [J]. TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2020, 28 (04) : 1824 - 1840
  • [45] Accelerating deep reinforcement learning model for game strategy
    Li, Yifan
    Fang, Yuchun
    Akhtar, Zahid
    [J]. NEUROCOMPUTING, 2020, 408 : 157 - 168
  • [46] Adaptive Eligibility Traces for Online Deep Reinforcement Learning
    Kobayashi, Taisuke
    [J]. INTELLIGENT AUTONOMOUS SYSTEMS 16, IAS-16, 2022, 412 : 417 - 428
  • [47] Research on Constant Perturbation Strategy for Deep Reinforcement Learning
    Shen, Jiamin
    Xu, Li
    Wan, Xu
    Chai, Jixuan
    Fan, Chunlong
    [J]. 2023 2ND ASIA CONFERENCE ON ALGORITHMS, COMPUTING AND MACHINE LEARNING, CACML 2023, 2023, : 526 - 533
  • [48] Distributed and Adaptive Traffic Engineering with Deep Reinforcement Learning
    Geng, Nan
    Xu, Mingwei
    Yang, Yuan
    Liu, Chenyi
    Yang, Jiahai
    Li, Qi
    Zhang, Shize
    [J]. 2021 IEEE/ACM 29TH INTERNATIONAL SYMPOSIUM ON QUALITY OF SERVICE (IWQOS), 2021
  • [49] Adaptive DAG Tasks Scheduling with Deep Reinforcement Learning
    Wu, Qing
    Wu, Zhiwei
    Zhuang, Yuehui
    Cheng, Yuxia
    [J]. ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2018, PT II, 2018, 11335 : 477 - 490
  • [50] Deep Reinforcement Learning with Adaptive Update Target Combination
    Xu, Z.
    Cao, L.
    Chen, X.
    [J]. COMPUTER JOURNAL, 2020, 63 (07): 995 - 1003