Mutual Deep Deterministic Policy Gradient Learning

Cited by: 0
Authors
Sun, Zhou [1]
Affiliations
[1] HongWen Sch Qingdao, Dept Sci, Qingdao, Peoples R China
Keywords
Machine Learning; Deep Reinforcement Learning; DDPG; Mutual Learning
DOI
10.1109/BDICN55575.2022.00099
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In deep reinforcement learning (DRL), policy gradient (PG) and actor-critic (AC) methods are among the most popular and effective approaches for training agents. One such method is the state-of-the-art deep deterministic policy gradient (DDPG). In this research, we combine the mutual learning framework with DDPG to present a novel Mutual DDPG (MuDDPG) agent, with the aim of improving the performance and robustness of conventional DDPG. We also propose a simple additional innovation, adaptive reward-based exploration, to further improve the rate of learning. We demonstrate that with these schemes, MuDDPG can converge faster and perform better than vanilla DDPG on two simple simulated tasks while making the learning process significantly more robust.
Pages: 508-513
Page count: 6