A Model-Based Method for Learning Locomotion Skills from Demonstration

Cited by: 2
Authors:
Park, Hyunseong [1 ]
Yoon, Sukmin [1 ]
Kim, Yong-Duk [1 ]
Affiliations:
[1] Agency for Defense Development, Daejeon, South Korea
DOI: 10.1109/SMC52423.2021.9658875
Chinese Library Classification: TP3 (computing technology; computer technology)
Discipline code: 0812
Abstract:
While Generative Adversarial Imitation Learning (GAIL) shows remarkable performance on many high-dimensional imitation learning tasks, it requires a large number of sampled transitions, which is infeasible for some real-world problems. In this paper, we demonstrate how exploiting the reward function in GAIL can improve sample efficiency. We design our algorithm to be end-to-end differentiable so that the learned reward function can directly participate in policy updates. End-to-end differentiability is achieved by introducing a forward model of the environment, enabling direct calculation of the cumulative reward. However, using a forward model has two significant limitations: it relies heavily on the accuracy of the forward model, and it requires multi-step prediction, which causes severe error accumulation. The proposed end-to-end differentiable adversarial imitation learning algorithm alleviates these limitations. We also suggest applying several existing regularization techniques for robust training of the forward model. We call our algorithm, integrated with these regularization methods, fully Differentiable Regularized GAIL (DRGAIL), and test DRGAIL on continuous control tasks.
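The core idea of the abstract — differentiating a model-predicted cumulative reward all the way back to the policy parameters — can be illustrated with a minimal toy sketch. This is our own construction, not the paper's code: it assumes a linear forward model, a quadratic reward, and a linear policy, and uses finite differences in place of automatic differentiation. The compounding of model error over the T-step rollout is exactly the multi-step-prediction limitation the abstract mentions.

```python
import numpy as np

# Toy "end-to-end differentiable" setup (illustrative only):
#   forward model  s' = A s + B a      (stands in for a learned dynamics net)
#   reward         r(s) = -||s - goal||^2  (stands in for a learned reward net)
#   policy         a = K s             (parameters K to be improved)
rng = np.random.default_rng(0)
n, m, T = 3, 2, 5
A = 0.9 * np.eye(n)
B = 0.1 * rng.normal(size=(n, m))
goal = rng.normal(size=n)
K = np.zeros((m, n))

def rollout_return(K, s0):
    """Cumulative reward of a T-step rollout through the forward model.
    Each step feeds the model's prediction back in, so model errors
    would accumulate over the horizon."""
    s, R = s0, 0.0
    for _ in range(T):
        s = A @ s + B @ (K @ s)
        R -= np.sum((s - goal) ** 2)
    return R

def grad_K(K, s0, eps=1e-6):
    """Central finite-difference gradient of the return w.r.t. the policy
    parameters (autodiff through the model would do this exactly)."""
    g = np.zeros_like(K)
    for i in range(m):
        for j in range(n):
            Kp = K.copy(); Kp[i, j] += eps
            Km = K.copy(); Km[i, j] -= eps
            g[i, j] = (rollout_return(Kp, s0) - rollout_return(Km, s0)) / (2 * eps)
    return g

s0 = rng.normal(size=n)
initial_return = rollout_return(K, s0)
for _ in range(200):  # gradient ascent on the model-predicted return
    K = K + 0.05 * grad_K(K, s0)
# After training, rollout_return(K, s0) exceeds initial_return:
# the reward gradient reached the policy through the model.
```

In DRGAIL the reward is adversarially learned and the forward model is a trained, regularized network rather than the fixed linear system assumed here, but the gradient path — reward, through the model rollout, into the policy — is the same.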
Pages: 327 - 332
Page count: 6
Related papers (50 total):
  • [1] Learning Locomotion Skills via Model-based Proximal Meta-Reinforcement Learning
    Xiao, Qing
    Cao, Zhengcai
    Zhou, Mengchu
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC), 2019, : 1545 - 1550
  • [2] Learning from demonstration with model-based Gaussian process
    Jaquier, Noemie
    Ginsbourger, David
    Calinon, Sylvain
    [J]. CONFERENCE ON ROBOT LEARNING, VOL 100, 2019, 100
  • [3] Learning hybrid locomotion skills-Learn to exploit residual actions and modulate model-based gait control
    Kasaei, Mohammadreza
    Abreu, Miguel
    Lau, Nuno
    Pereira, Artur
    Reis, Luis Paulo
    Li, Zhibin
    [J]. FRONTIERS IN ROBOTICS AND AI, 2023, 10
  • [4] Online Learning of Unknown Dynamics for Model-Based Controllers in Legged Locomotion
    Sun, Yu
    Ubellacker, Wyatt L.
    Ma, Wen-Loong
    Zhang, Xiang
    Wang, Changhao
    Csomay-Shanklin, Noel V.
    Tomizuka, Masayoshi
    Sreenath, Koushil
    Ames, Aaron D.
    [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2021, 6 (04): : 8442 - 8449
  • [5] Deep Adversarial Imitation Learning of Locomotion Skills from One-shot Video Demonstration
    Zhang, Huiwen
    Liu, Yuwang
    Zhou, Weijia
    [J]. 2019 9TH IEEE ANNUAL INTERNATIONAL CONFERENCE ON CYBER TECHNOLOGY IN AUTOMATION, CONTROL, AND INTELLIGENT SYSTEMS (IEEE-CYBER 2019), 2019, : 1257 - 1261
  • [6] COMPLIANT LOCOMOTION: A MODEL-BASED APPROACH
    Hopkins, Michael
    Griffin, Robert
    Leonessa, Alexander
    [J]. MECHANICAL ENGINEERING, 2015, 137 (06):
  • [7] Model-Based Action Exploration for Learning Dynamic Motion Skills
    Berseth, Glen
    Kyriazis, Alex
    Zinin, Ivan
    Choi, William
    van de Panne, Michiel
    [J]. 2018 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2018, : 1540 - 1546
  • [8] Learning from demonstration and adaptation of biped locomotion
    Nakanishi, J
    Morimoto, J
    Endo, G
    Cheng, G
    Schaal, S
    Kawato, M
    [J]. ROBOTICS AND AUTONOMOUS SYSTEMS, 2004, 47 (2-3) : 79 - 91
  • [9] Demonstration of Nimbus: Model-based Pricing for Machine Learning in a Data Marketplace
    Chen, Lingjiao
    Wang, Hongyi
    Chen, Leshang
    Koutris, Paraschos
    Kumar, Arun
    [J]. SIGMOD '19: PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2019, : 1885 - 1888
  • [10] Combining Learning-Based Locomotion Policy With Model-Based Manipulation for Legged Mobile Manipulators
    Ma, Yuntao
    Farshidian, Farbod
    Miki, Takahiro
    Lee, Joonho
    Hutter, Marco
    [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2022, 7 (02): : 2377 - 2384