Addressing Delays in Reinforcement Learning via Delayed Adversarial Imitation Learning

Cited by: 2
Authors
Xie, Minzhi [1 ]
Xia, Bo [1 ]
Yu, Yalou [1 ]
Wang, Xueqian [1 ]
Chang, Yongzhe [1 ]
Affiliations
[1] Tsinghua Univ, Shenzhen Int Grad Sch, Shenzhen 518000, Peoples R China
Keywords
Reinforcement Learning; Delays; Adversarial Imitation Learning
DOI
10.1007/978-3-031-44213-1_23
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Observation and action delays are common in real-world tasks; they violate the Markov property and consequently degrade the performance of reinforcement learning (RL) methods. Several lines of work have addressed delays in RL: model-based methods train forward models to predict the unknown current information, while model-free approaches rely on state augmentation to define new Markov decision processes. However, previous works suffer from difficult model fine-tuning and the curse of dimensionality, which prevent them from handling delays effectively. Motivated by the strengths of imitation learning, we introduce the idea that a delayed policy can be trained by imitating undelayed expert demonstrations, and we propose an algorithm named Delayed Adversarial Imitation Learning (DAIL). In DAIL, a few undelayed expert demonstrations are used to generate a surrogate delayed expert, and a delayed policy is trained to imitate this surrogate expert via adversarial imitation learning. We also present a theoretical analysis that validates the rationale of DAIL and guides its practical design. Finally, experiments on continuous control tasks demonstrate that DAIL substantially outperforms previous approaches to delays in RL: it converges to high performance with excellent sample efficiency even under substantial delays, whereas previous methods fail to converge.
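The abstract states the mechanism only at a high level, so the following minimal Python/PyTorch sketch shows one plausible reading of the DAIL recipe, not the authors' published code: augment the delayed observation with the buffer of the most recent d actions (the standard state-augmentation construction for constant-delay MDPs), relabel an undelayed expert trajectory into surrogate delayed expert (state, action) pairs, and train a GAIL-style discriminator whose confusion supplies the imitation reward. The constant delay, the dimensions, and the helper names (augment, surrogate_delayed_expert, discriminator_step, imitation_reward) are all illustrative assumptions.

import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, DELAY = 8, 2, 3           # illustrative sizes, not from the paper
AUG_DIM = STATE_DIM + DELAY * ACTION_DIM         # dimension of the augmented state

def augment(last_seen_state, action_buffer):
    # Augmented state x_t = (s_{t-d}, a_{t-d}, ..., a_{t-1}): the last observed
    # state concatenated with the d actions taken since it was observed.
    return torch.cat([last_seen_state, action_buffer.flatten()])

def surrogate_delayed_expert(states, actions, delay):
    # Relabel an undelayed expert trajectory as delayed (augmented state, action)
    # pairs: at step t the surrogate expert pairs x_t with the action the
    # undelayed expert actually took at t.
    pairs = []
    for t in range(delay, len(actions)):
        buf = torch.stack(actions[t - delay:t])  # a_{t-d}, ..., a_{t-1}
        pairs.append((augment(states[t - delay], buf), actions[t]))
    return pairs

# GAIL-style discriminator over (augmented state, action) pairs; its confusion
# provides the imitation reward for an off-the-shelf RL learner.
disc = nn.Sequential(nn.Linear(AUG_DIM + ACTION_DIM, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(disc.parameters(), lr=3e-4)
bce = nn.BCEWithLogitsLoss()

def discriminator_step(expert_xa, policy_xa):
    # One adversarial update: expert pairs are labeled 1, policy pairs 0.
    logits = disc(torch.cat([expert_xa, policy_xa]))
    labels = torch.cat([torch.ones(len(expert_xa), 1), torch.zeros(len(policy_xa), 1)])
    loss = bce(logits, labels)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

def imitation_reward(xa):
    # Standard GAIL-form reward: high when the discriminator mistakes a
    # policy pair for an expert pair.
    with torch.no_grad():
        return -torch.log(1.0 - torch.sigmoid(disc(xa)) + 1e-8)

# Toy end-to-end check with random stand-ins for real demonstrations and rollouts.
states = [torch.randn(STATE_DIM) for _ in range(10)]
actions = [torch.randn(ACTION_DIM) for _ in range(10)]
expert_pairs = surrogate_delayed_expert(states, actions, DELAY)
expert_batch = torch.stack([torch.cat([x, a]) for x, a in expert_pairs])
policy_batch = torch.randn(len(expert_pairs), AUG_DIM + ACTION_DIM)
print(discriminator_step(expert_batch, policy_batch), imitation_reward(policy_batch).mean())

In a full implementation the policy batch would come from rollouts of the delayed learner and the reward would be fed to an RL algorithm; the sketch only illustrates the trajectory relabeling and the adversarial objective.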
Pages: 271-282
Number of pages: 12
Related Papers
50 records in total
  • [1] Delayed Reinforcement Learning by Imitation
    Liotet, Pierre
    Maran, Davide
    Bisi, Lorenzo
    Restelli, Marcello
    International Conference on Machine Learning, Vol 162, 2022
  • [2] Adversarial Imitation Learning via Random Search
    Shin, MyungJae
    Kim, Joongheon
    2019 International Joint Conference on Neural Networks (IJCNN), 2019
  • [3] Addressing implicit bias in adversarial imitation learning with mutual information
    Zhang, Lihua
    Liu, Quan
    Zhu, Fei
    Huang, Zhigang
    Neural Networks, 2023, 167: 847-864
  • [4] Methodologies for Imitation Learning via Inverse Reinforcement Learning: A Review
    Zhang, K.
    Yu, Y.
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2019, 56 (02): 254-261
  • [5] Developing multi-agent adversarial environment using reinforcement learning and imitation learning
    Han, Ziyao
    Liang, Yupeng
    Ohkura, Kazuhiro
    Artificial Life and Robotics, 2023, 28 (04): 703-709
  • [6] Multimodal Storytelling via Generative Adversarial Imitation Learning
    Chen, Zhiqian
    Zhang, Xuchao
    Boedihardjo, Arnold P.
    Dai, Jing
    Lu, Chang-Tien
    Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017: 3967-3973
  • [7] Imitation and reinforcement learning
    Kober, J.
    Peters, J.
    IEEE Robotics and Automation Magazine, 2010, 17 (02): 55-62
  • [8] Generative Adversarial Imitation Learning
    Ho, Jonathan
    Ermon, Stefano
    Advances in Neural Information Processing Systems 29 (NIPS 2016), 2016
  • [9] UAV Control Method Combining Reptile Meta-Reinforcement Learning and Generative Adversarial Imitation Learning
    Jiang, Shui
    Ge, Yanning
    Yang, Xu
    Yang, Wencheng
    Cui, Hui
    Future Internet, 2024, 16 (03)