A Data-Efficient Training Method for Deep Reinforcement Learning

Cited by: 0
Authors
Feng, Wenhui [1 ]
Han, Chongzhao [1 ]
Lian, Feng [1 ]
Liu, Xia [1 ]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Automat Sci & Engn, Key Lab Intelligent Networks, Minist Educ, Xian 710049, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
deep reinforcement learning; data efficiency; curriculum learning; transfer learning; LEVEL; ENVIRONMENT; GO;
DOI
10.3390/electronics11244205
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Data inefficiency is one of the major challenges for deploying deep reinforcement learning algorithms widely in industry control fields, especially in regard to long-horizon sparse reward tasks. Even in a simulation-based environment, it is often prohibitive to take weeks to train an algorithm. In this study, a data-efficient training method is proposed in which a DQN is used as a base algorithm, and an elaborate curriculum is designed for the agent in the simulation scenario to accelerate the training process. In the early stage of the training process, the distribution of the initial state is set close to the goal so the agent can obtain an informative reward easily. As the training continues, the initial state distribution is set farther from the goal for the agent to explore more state space. Thus, the agent can obtain a reasonable policy through fewer interactions with the environment. To bridge the sim-to-real gap, the parameters for the output layer of the neural network for the value function are fine-tuned. An experiment on UAV maneuver control is conducted in the proposed training framework to verify the method. We demonstrate that data efficiency is different for the same data in different training stages.
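The curriculum described in the abstract (start near the goal, then move the initial state distribution farther away as training continues) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the 1-D state, the linear schedule, and the names `max_start_distance`, `sample_initial_state`, `d_min`, and `d_max` are all hypothetical.

```python
import random


def max_start_distance(step, total_steps, d_min=1.0, d_max=10.0):
    """Largest allowed start-to-goal distance at a given training step.

    Assumed linear schedule: d_min at step 0, d_max at total_steps.
    """
    frac = min(max(step / total_steps, 0.0), 1.0)
    return d_min + frac * (d_max - d_min)


def sample_initial_state(goal, step, total_steps, rng=random):
    """Sample a 1-D start position within the current curriculum radius."""
    radius = max_start_distance(step, total_steps)
    return goal + rng.uniform(-radius, radius)


# Early starts stay close to the goal; late starts may lie much farther away.
rng = random.Random(0)
early = [abs(sample_initial_state(0.0, 0, 1000, rng)) for _ in range(100)]
late = [abs(sample_initial_state(0.0, 1000, 1000, rng)) for _ in range(100)]
```

In a training loop, such a sampler would be called at each environment reset, so that early episodes yield informative rewards quickly and later episodes cover more of the state space.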
Pages: 13
Related papers
50 in total
  • [31] Data-Efficient Deep Reinforcement Learning-Based Optimal Generation Control in DC Microgrids
    Fan, Zhen
    Zhang, Wei
    Liu, Wenxin
    [J]. IEEE SYSTEMS JOURNAL, 2024, 18 (01): : 426 - 437
  • [32] DeMis: Data-Efficient Misinformation Detection Using Reinforcement Learning
    Kawintiranon, Kornraphop
    Singh, Lisa
    [J]. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2022, PT II, 2023, 13714 : 224 - 240
  • [33] Shielded Planning Guided Data-Efficient and Safe Reinforcement Learning
    Wang, Hao
    Qin, Jiahu
    Kan, Zhen
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, : 1 - 12
  • [34] Unsupervised Salient Patch Selection for Data-Efficient Reinforcement Learning
    Jiang, Zhaohui
    Weng, Paul
    [J]. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: RESEARCH TRACK, ECML PKDD 2023, PT IV, 2023, 14172 : 556 - 572
  • [35] Self-Tuning for Data-Efficient Deep Learning
    Wang, Ximei
    Gao, Jinghan
    Long, Mingsheng
    Wang, Jianmin
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139 : 7748 - 7759
  • [36] Data Based Optimal Control with Neural Networks and Data-Efficient Reinforcement Learning
    Runkler, Thomas A.
    Udluft, Steffen
    Duell, Siegmund
    [J]. AT-AUTOMATISIERUNGSTECHNIK, 2012, 60 (10) : 641 - 647
  • [37] Data-Efficient Deep Reinforcement Learning for Attitude Control of Fixed-Wing UAVs: Field Experiments
    Bohn, Eivind
    Coates, Erlend M.
    Reinhardt, Dirk
    Johansen, Tor Arne
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (03) : 3168 - 3180
  • [38] DATA-EFFICIENT MODEL-BASED REINFORCEMENT LEARNING FOR ROBOT CONTROL
    Sun, Ming
    Gao, Yue
    Liu, Wei
    Li, Shaoyuan
    [J]. INTERNATIONAL JOURNAL OF ROBOTICS & AUTOMATION, 2021, 36 (04): : 211 - 218
  • [39] Data-efficient model-based reinforcement learning with trajectory discrimination
    Tuo Qu
    Fuqing Duan
    Junge Zhang
    Bo Zhao
    Wenzhen Huang
    [J]. Complex & Intelligent Systems, 2024, 10 : 1927 - 1936
  • [40] Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning
    Thomas, Philip S.
    Brunskill, Emma
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48