Addressing Delays in Reinforcement Learning via Delayed Adversarial Imitation Learning

Cited by: 2
Authors
Xie, Minzhi [1]
Xia, Bo [1]
Yu, Yalou [1]
Wang, Xueqian [1]
Chang, Yongzhe [1]
Affiliations
[1] Tsinghua Univ, Shenzhen Int Grad Sch, Shenzhen 518000, Peoples R China
Source
ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT III | 2023 / Vol. 14256
Keywords
Reinforcement Learning; Delays; Adversarial Imitation Learning;
DOI
10.1007/978-3-031-44213-1_23
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Observation and action delays are common in real-world tasks; they violate the Markov property and consequently degrade the performance of Reinforcement Learning (RL) methods. Several efforts have addressed delays in RL. Model-based methods train forward models to predict the unknown current information, while model-free approaches rely on state augmentation to define new Markov Decision Processes. However, previous works suffer from difficult model fine-tuning and the curse of dimensionality, which prevent them from handling delays effectively. Motivated by the advantages of imitation learning, we introduce a novel idea: a delayed policy can be trained by imitating undelayed expert demonstrations. Based on this idea, we propose an algorithm named Delayed Adversarial Imitation Learning (DAIL). In DAIL, a few undelayed expert demonstrations are used to generate a surrogate delayed expert, and a delayed policy is trained by imitating the surrogate expert using adversarial imitation learning. Moreover, a theoretical analysis of DAIL is presented to validate its rationale and guide the practical design of the approach. Finally, experiments on continuous control tasks demonstrate that DAIL achieves much higher performance than previous approaches to delays in RL: DAIL converges to high performance with excellent sample efficiency, even for substantial delays, while previous works fail to do so because of divergence problems.
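The state-augmentation idea that the abstract attributes to model-free approaches can be illustrated with a minimal sketch. This is not the DAIL algorithm and is not taken from the paper. It assumes a constant action delay and one common formulation in which the augmented state is the last observed state together with the queue of actions that have been chosen but not yet executed. The names `CounterEnv`, `ActionDelayWrapper`, and `noop` are hypothetical and exist only for this example.

```python
from collections import deque
from typing import Deque, Tuple


class CounterEnv:
    """Hypothetical toy environment: the state is an integer and each action is added to it."""

    def __init__(self) -> None:
        self.state = 0

    def reset(self) -> int:
        self.state = 0
        return self.state

    def step(self, action: int) -> int:
        self.state += action
        return self.state


class ActionDelayWrapper:
    """Illustrative wrapper (an assumption, not the paper's code).

    Every action chosen by the agent is executed `delay` steps later.
    The observation returned to the agent is the augmented state
    (env_state, pending_actions), which restores the Markov property
    at the cost of a larger state space.
    """

    def __init__(self, env: CounterEnv, delay: int, noop: int = 0) -> None:
        self.env = env
        self.delay = delay
        self.noop = noop  # action executed while the queue is still warming up
        self.pending: Deque[int] = deque()

    def reset(self) -> Tuple[int, Tuple[int, ...]]:
        state = self.env.reset()
        self.pending = deque([self.noop] * self.delay)
        return state, tuple(self.pending)

    def step(self, action: int) -> Tuple[int, Tuple[int, ...]]:
        # Enqueue the new action, then execute the oldest pending one.
        self.pending.append(action)
        executed = self.pending.popleft()
        state = self.env.step(executed)
        return state, tuple(self.pending)


env = ActionDelayWrapper(CounterEnv(), delay=2)
obs0 = env.reset()
obs1 = env.step(5)  # executes the first no-op; 5 is queued
obs2 = env.step(7)  # executes the second no-op; 7 is queued
obs3 = env.step(1)  # executes 5, two steps after it was chosen
```

The augmented state grows with the delay length (here, two extra action slots for a delay of two), which is one way to see the curse of dimensionality that the abstract mentions for longer delays.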
Pages: 271-282
Number of pages: 12