A novel action decision method of deep reinforcement learning based on a neural network and confidence bound

Cited by: 0
Authors
Wenhao Zhang
Yaqing Song
Xiangpeng Liu
Qianqian Shangguan
Kang An
Affiliations
[1] Shanghai Normal University, The College of Information, Mechanical and Electrical Engineering
Source
Applied Intelligence | 2023 / Vol. 53
Keywords
UCB; Exploration and exploitation; Deep reinforcement learning; Machine learning;
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
In deep reinforcement learning, the excessive randomness of the ε-greedy method can degrade the training of the agent. This paper proposes a novel action decision method that replaces the ε-greedy method and avoids this excessive randomness. First, a confidence bound span fitting model based on a deep neural network is proposed to solve the problem that UCB cannot estimate the confidence bound span of each action in a high-dimensional state space. Then, a confidence bound span balance model based on the target value in reverse order is proposed: the parameters of the U network are updated by backpropagation after each action decision to balance the confidence bound span. Finally, an exploration-exploitation dynamic balance factor α is introduced to balance exploration and exploitation during training. Experiments with the Nature DQN and Double DQN algorithms show that the proposed method outperforms the ε-greedy method under the base algorithms and experimental environments used in this paper. The method is significant for applying confidence bounds to complex reinforcement learning problems.
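The abstract does not give the decision rule itself, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the action is chosen by maximizing Q(s, a) + α · U(s, a); this additive weighting, the function names, and the plain lists standing in for the outputs of the Q and U networks are all illustrative assumptions. The ε-greedy baseline and the classical count-based UCB1 bonus are included for contrast; the latter needs a visit count per action, which is what becomes impractical in a high-dimensional state space.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Baseline: with probability epsilon pick a uniformly random action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def ucb1_bonus(total_steps, action_count, c=2.0):
    """Classical UCB1 bonus sqrt(c * ln(t) / n); requires a count per action."""
    if action_count == 0:
        return math.inf  # untried actions are always preferred
    return math.sqrt(c * math.log(total_steps) / action_count)


def confidence_bound_action(q_values, u_values, alpha):
    """Pick argmax over a of Q(s, a) + alpha * U(s, a).

    q_values: value estimates (hypothetical stand-in for the Q network output).
    u_values: confidence bound spans (hypothetical stand-in for the U network).
    alpha:    balance factor; the additive weighting is an assumption.
    """
    scores = [q + alpha * u for q, u in zip(q_values, u_values)]
    return max(range(len(scores)), key=lambda a: scores[a])


if __name__ == "__main__":
    q = [1.0, 0.8, 0.2]  # made-up Q estimates for one state
    u = [0.1, 0.9, 0.3]  # made-up confidence bound spans for the same state
    print(confidence_bound_action(q, u, alpha=0.0))  # pure exploitation
    print(confidence_bound_action(q, u, alpha=1.0))  # span-weighted choice
```

With these made-up numbers, α = 0 reduces to the greedy choice, while a larger α shifts the choice toward the action whose span is widest. Unlike ε-greedy, the selection is deterministic given the two sets of estimates.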
Pages: 21299 - 21311
Page count: 12
Related papers
50 in total
  • [1] A novel action decision method of deep reinforcement learning based on a neural network and confidence bound
    Zhang, Wenhao
    Song, Yaqing
    Liu, Xiangpeng
    Shangguan, Qianqian
    An, Kang
    [J]. APPLIED INTELLIGENCE, 2023, 53 (18) : 21299 - 21311
  • [2] A Novel Filter-Level Deep Convolutional Neural Network Pruning Method Based on Deep Reinforcement Learning
    Feng, Yihao
    Huang, Chao
    Wang, Long
    Luo, Xiong
    Li, Qingwen
    [J]. APPLIED SCIENCES-BASEL, 2022, 12 (22):
  • [3] Intelligent Maneuver Decision Method of UAV based on Reinforcement Learning and Neural Network
    Thou, Huan
    Zhang, Senyu
    Sun, Chu
    Ru, Changjian
    [J]. 2021 PROCEEDINGS OF THE 40TH CHINESE CONTROL CONFERENCE (CCC), 2021, : 8544 - 8549
  • [4] A novel neural network based reinforcement learning
    Fan, Jian
    Song, Yang
    Fei, MinRui
    Zhao, Qijie
    [J]. BIO-INSPIRED COMPUTATIONAL INTELLIGENCE AND APPLICATIONS, 2007, 4688 : 46 - +
  • [5] Deep neural network pruning method based on sensitive layers and reinforcement learning
    Yang, Wenchuan
    Yu, Haoran
    Cui, Baojiang
    Sui, Runqi
    Gu, Tianyu
    [J]. ARTIFICIAL INTELLIGENCE REVIEW, 2023, 56 (SUPPL 2) : 1897 - 1917
  • [6] Deep neural network pruning method based on sensitive layers and reinforcement learning
    Wenchuan Yang
    Haoran Yu
    Baojiang Cui
    Runqi Sui
    Tianyu Gu
    [J]. Artificial Intelligence Review, 2023, 56 : 1897 - 1917
  • [7] A novel method of heterogeneous combat network disintegration based on deep reinforcement learning
    Chen, Libin
    Wang, Chen
    Zeng, Chengyi
    Wang, Luyao
    Liu, Hongfu
    Chen, Jing
    [J]. FRONTIERS IN PHYSICS, 2022, 10
  • [8] An ensemble learning method based on deep neural network and group decision making
    Zhou, Xiaojun
    He, Jingyi
    Yang, Chunhua
    [J]. KNOWLEDGE-BASED SYSTEMS, 2022, 239
  • [9] Automatic Compression of Neural Network with Deep Reinforcement Learning Based on Proximal Gradient Method
    Wang, Mingyi
    Tang, Jianhao
    Zhao, Haoli
    Li, Zhenni
    Xie, Shengli
    [J]. MATHEMATICS, 2023, 11 (02)
  • [10] A Dual Deep Network Based Secure Deep Reinforcement Learning Method
    Zhu, Fei
    Wu, Wen
    Fu, Yu-Chen
    Liu, Quan
    [J]. Jisuanji Xuebao/Chinese Journal of Computers, 2019, 42 (08) : 1812 - 1826