Robust Imitation of a Few Demonstrations with a Backwards Model

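Authors
Park, Jung Yeon [1]
Wong, Lawson L. S. [1]
Affiliations
[1] Northeastern Univ, Khoury Coll Comp Sci, Boston, MA 02115 USA
Funding
National Science Foundation (USA)
Keywords
GO
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract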
Behavior cloning of expert demonstrations can learn optimal policies more sample-efficiently than reinforcement learning. However, the resulting policy extrapolates poorly to states outside the demonstration data, which leads to covariate shift (the agent drifting away from the demonstrations) and compounding errors. In this work, we tackle this issue by extending the region of attraction around the demonstrations so that the agent can learn how to get back onto the demonstrated trajectories if it veers off course. We train a generative backwards dynamics model and generate short imagined trajectories from states in the demonstrations. By imitating both the demonstrations and these model rollouts, the agent learns the demonstrated paths and how to get back onto them. With optimal or near-optimal demonstrations, the learned policy will be both optimal and robust to deviations, with a wider region of attraction. On continuous control domains, we evaluate robustness when starting from initial states unseen in the demonstration data. While both our method and other imitation learning baselines successfully solve the tasks for initial states in the training distribution, our method exhibits considerably more robustness to unseen initial states.
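The data-generation idea in the abstract (fit a backwards dynamics model, roll it backwards from demonstration states, then imitate both demonstrations and rollouts) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the 1-D toy dynamics `step`, the extra random transitions used to fit the model, the linear least-squares fit that stands in for the generative backwards model, the uniform action sampler, the rollout length `H`, and the linear policy.

```python
import numpy as np

rng = np.random.default_rng(0)

def step(s, a):
    """Hypothetical toy dynamics (not from the paper): 1-D point, s' = s + a."""
    return s + a

# A few near-optimal demonstrations that drive the state toward the origin.
demos = []  # rows of (s, a, s_next)
for s0 in (1.0, 0.8):
    s = s0
    for _ in range(10):
        a = -0.5 * s
        s_next = step(s, a)
        demos.append((s, a, s_next))
        s = s_next
demos = np.array(demos)

# Assumption: some extra transitions are available to fit the backwards model.
S = rng.uniform(-2.0, 2.0, size=200)
A = rng.uniform(-1.0, 1.0, size=200)
S_next = step(S, A)

# Backwards model: predict the previous state from (next state, action).
# A linear least-squares fit stands in for the generative model in the abstract.
X = np.column_stack([S_next, A, np.ones_like(A)])
w, *_ = np.linalg.lstsq(X, S, rcond=None)

def backward(s_next, a):
    return w[0] * s_next + w[1] * a + w[2]

# Short imagined rollouts that end on demonstration states.
H = 3  # rollout length (hypothetical choice)
imagined = []  # rows of (s_prev, a, s_next)
for s_demo in demos[:, 0]:
    s_cur = s_demo
    for _ in range(H):
        a = rng.uniform(-0.3, 0.3)  # assumed action sampler
        s_prev = backward(s_cur, a)
        imagined.append((s_prev, a, s_cur))
        s_cur = s_prev
imagined = np.array(imagined)

# Behavior cloning on demonstrations plus imagined transitions (linear policy).
data = np.vstack([demos, imagined])
P = np.column_stack([data[:, 0], np.ones(len(data))])
theta, *_ = np.linalg.lstsq(P, data[:, 1], rcond=None)

def policy(s):
    return theta[0] * s + theta[1]
```

Each imagined row `(s_prev, a, s_next)` is a state-action pair whose action leads one step closer to a demonstrated state, so replaying the actions of a rollout in reverse order of generation ends on the demonstration. This toy does not reproduce the paper's robustness results; it only shows the shape of the augmented dataset.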
Pages: 14