A novel action decision method of deep reinforcement learning based on a neural network and confidence bound

Cited by: 0

Authors:

Wenhao Zhang

Yaqing Song

Xiangpeng Liu

Qianqian Shangguan

Kang An

Affiliation:

[1] Shanghai Normal University, The College of Information, Mechanical and Electrical Engineering

Source:

Applied Intelligence | 2023 / Vol. 53

Keywords:

UCB; Exploration and exploitation; Deep reinforcement learning; Machine learning;

DOI:

Not available

Chinese Library Classification number:

Subject classification number:

Abstract:

From the perspective of the deep reinforcement learning algorithm, the training effect of the agent will be affected because of the excessive randomness of the ε-greedy method. This paper proposes a novel action decision method to replace the ε-greedy method and avoid excessive randomness. First, a confidence bound span fitting model based on a deep neural network is proposed to fundamentally solve the problem that UCB cannot estimate the confidence bound span of each action in high-dimensional state space. Then, a confidence bound span balance model based on target value in reverse order is proposed. The parameters of the U network are updated after each action decision using the backpropagation of the neural network to balance the confidence bound span. Finally, an exploration-exploitation dynamic balance factor α is introduced to balance exploration and exploitation in the training process. Experiments are conducted using the Nature DQN and Double DQN algorithms, and the results demonstrate that the proposed method achieves higher performance than the ε-greedy method under the basic algorithm and experimental environment of this paper. The method presented in this paper has significance for applying a confidence bound to solve complex reinforcement problems.
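
The contrast the abstract draws between ε-greedy and a confidence-bound decision rule can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so the combination rule below (score = Q + α · span) is an assumption borrowed from the generic UCB form, and the names `epsilon_greedy`, `confidence_bound_action`, `q_values`, and `spans` are hypothetical. In the paper the spans would come from the U network; here they are plain numbers.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Baseline rule: with probability epsilon pick a uniformly random action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def confidence_bound_action(q_values, spans, alpha):
    """Pick the action maximizing Q + alpha * span.

    ASSUMPTION: this additive UCB-style score is illustrative only; the
    abstract does not state how the paper combines the Q-value, the
    confidence bound span, and the balance factor alpha.
    """
    scores = [q + alpha * u for q, u in zip(q_values, spans)]
    return max(range(len(scores)), key=lambda a: scores[a])


# Hypothetical values for one state with three actions.
q_values = [1.0, 0.8, 0.2]  # estimated action values
spans = [0.1, 0.5, 0.3]     # estimated confidence bound spans

greedy_choice = confidence_bound_action(q_values, spans, alpha=0.0)
exploratory_choice = confidence_bound_action(q_values, spans, alpha=1.0)
```

With `alpha=0.0` the rule reduces to greedy selection on the Q-values; with a larger `alpha` an action with a slightly lower Q-value but a wider span can win, which is a deterministic form of exploration rather than the random one used by ε-greedy.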

Pages: 21299 - 21311

Page count: 12

50 records in total

[1] A novel action decision method of deep reinforcement learning based on a neural network and confidence bound
Zhang, Wenhao
Song, Yaqing
Liu, Xiangpeng
Shangguan, Qianqian
An, Kang
[J]. APPLIED INTELLIGENCE, 2023, 53 (18) : 21299 - 21311
[2] A Novel Filter-Level Deep Convolutional Neural Network Pruning Method Based on Deep Reinforcement Learning
Feng, Yihao
Huang, Chao
Wang, Long
Luo, Xiong
Li, Qingwen
[J]. APPLIED SCIENCES-BASEL, 2022, 12 (22)
[3] Intelligent Maneuver Decision Method of UAV based on Reinforcement Learning and Neural Network
Thou, Huan
Zhang, Senyu
Sun, Chu
Ru, Changjian
[J]. 2021 PROCEEDINGS OF THE 40TH CHINESE CONTROL CONFERENCE (CCC), 2021, : 8544 - 8549
[4] A novel neural network based reinforcement learning
Fan, Jian
Song, Yang
Fei, MinRui
Zhao, Qijie
[J]. BIO-INSPIRED COMPUTATIONAL INTELLIGENCE AND APPLICATIONS, 2007, 4688 : 46 - +
[5] Deep neural network pruning method based on sensitive layers and reinforcement learning
Yang, Wenchuan
Yu, Haoran
Cui, Baojiang
Sui, Runqi
Gu, Tianyu
[J]. ARTIFICIAL INTELLIGENCE REVIEW, 2023, 56 (SUPPL 2) : 1897 - 1917
[6] Deep neural network pruning method based on sensitive layers and reinforcement learning
Wenchuan Yang
Haoran Yu
Baojiang Cui
Runqi Sui
Tianyu Gu
[J]. Artificial Intelligence Review, 2023, 56 : 1897 - 1917
[7] A novel method of heterogeneous combat network disintegration based on deep reinforcement learning
Chen, Libin
Wang, Chen
Zeng, Chengyi
Wang, Luyao
Liu, Hongfu
Chen, Jing
[J]. FRONTIERS IN PHYSICS, 2022, 10
[8] An ensemble learning method based on deep neural network and group decision making
Zhou, Xiaojun
He, Jingyi
Yang, Chunhua
[J]. KNOWLEDGE-BASED SYSTEMS, 2022, 239
[9] Automatic Compression of Neural Network with Deep Reinforcement Learning Based on Proximal Gradient Method
Wang, Mingyi
Tang, Jianhao
Zhao, Haoli
Li, Zhenni
Xie, Shengli
[J]. MATHEMATICS, 2023, 11 (02)
[10] A Dual Deep Network Based Secure Deep Reinforcement Learning Method
Zhu, Fei
Wu, Wen
Fu, Yu-Chen
Liu, Quan
[J]. Jisuanji Xuebao/Chinese Journal of Computers, 2019, 42 (08) : 1812 - 1826
