A Data-Efficient Training Method for Deep Reinforcement Learning

Cited by: 0
Authors
Feng, Wenhui [1 ]
Han, Chongzhao [1 ]
Lian, Feng [1 ]
Liu, Xia [1 ]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Automat Sci & Engn, Key Lab Intelligent Networks, Minist Educ, Xian 710049, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
deep reinforcement learning; data efficiency; curriculum learning; transfer learning; LEVEL; ENVIRONMENT; GO;
DOI
10.3390/electronics11244205
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Data inefficiency is one of the major challenges in deploying deep reinforcement learning algorithms widely in industrial control, especially for long-horizon sparse-reward tasks. Even in a simulation-based environment, taking weeks to train an algorithm is often prohibitive. In this study, a data-efficient training method is proposed in which a DQN is used as the base algorithm, and an elaborate curriculum is designed for the agent in the simulation scenario to accelerate the training process. In the early stage of training, the initial state distribution is set close to the goal so that the agent can easily obtain an informative reward. As training continues, the initial state distribution is set farther from the goal so that the agent explores more of the state space. Thus, the agent can obtain a reasonable policy through fewer interactions with the environment. To bridge the sim-to-real gap, the parameters of the output layer of the value-function neural network are fine-tuned. An experiment on UAV maneuver control is conducted in the proposed training framework to verify the method. We demonstrate that the same data yield different data efficiency at different training stages.
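The curriculum described in the abstract (start states near the goal early in training, farther away later) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the one-dimensional state, the linear schedule, the hypothetical function names `curriculum_radius` and `sample_initial_state`, and the radius bounds `r_min` and `r_max`.

```python
import random


def curriculum_radius(step, total_steps, r_min=1.0, r_max=10.0):
    """Maximum start-to-goal distance allowed at a given training step.

    Assumed linear schedule: the radius grows from r_min to r_max as
    training progresses, and is clamped outside [0, total_steps].
    """
    frac = min(max(step / total_steps, 0.0), 1.0)
    return r_min + frac * (r_max - r_min)


def sample_initial_state(goal, step, total_steps, rng=random):
    """Sample a 1-D start position within the current radius of the goal.

    Early in training the sample lies close to the goal, so a sparse
    reward is reached quickly; later samples may lie farther away.
    """
    radius = curriculum_radius(step, total_steps)
    return goal + rng.uniform(-radius, radius)
```

In a training loop, `sample_initial_state` would be called at each episode reset with the current step count, so the DQN agent sees progressively harder starting conditions. The actual schedule and state space used by the authors may differ.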
Pages: 13