Bayesian Disturbance Injection: Robust imitation learning of flexible policies for robot manipulation

Cited by: 6
Authors
Oh, Hanbit [1]
Sasaki, Hikaru [1]
Michael, Brendan [1]
Matsubara, Takamitsu [1]
Affiliations
[1] NAIST, Grad Sch Sci & Technol, Div Informat Sci, 8916-5, Takayama Cho, Ikoma City, Nara 6300192, Japan
Keywords
Imitation learning; Disturbance injection; Human behavior characteristics; Robotic manipulation; GAUSSIAN-PROCESSES; TASK; SENSITIVITY
DOI
10.1016/j.neunet.2022.11.008
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Humans demonstrate a variety of interesting behavioral characteristics when performing tasks, such as selecting between seemingly equivalent optimal actions, performing recovery actions when deviating from the optimal trajectory, or moderating actions in response to sensed risks. However, imitation learning, which attempts to teach robots to perform these same tasks from observations of human demonstrations, often fails to capture such behavior. Specifically, commonly used learning algorithms embody inherent contradictions between the learning assumptions (e.g., single optimal action) and actual human behavior (e.g., multiple optimal actions), thereby limiting robot generalizability, applicability, and demonstration feasibility. To address this, this paper proposes designing imitation learning algorithms with a focus on utilizing human behavioral characteristics, thereby embodying principles for capturing and exploiting actual demonstrator behavioral characteristics. This paper presents the first imitation learning framework, Bayesian Disturbance Injection (BDI), that typifies human behavioral characteristics by incorporating model flexibility, robustification, and risk sensitivity. Bayesian inference is used to learn flexible non-parametric multi-action policies, while simultaneously robustifying policies by injecting risk-sensitive disturbances to induce human recovery action and ensuring demonstration feasibility. Our method is evaluated through risk-sensitive simulations and real-robot experiments (e.g., table-sweep task, shaft-reach task and shaft-insertion task) using the UR5e 6-DOF robotic arm, to demonstrate the improved characterization of behavior. Results show significant improvement in task performance, through improved flexibility, robustness as well as demonstration feasibility. © 2022 Elsevier Ltd. All rights reserved.
Pages: 42-58
Number of pages: 17
Related papers
32 in total
  • [1] Bayesian Disturbance Injection: Robust Imitation Learning of Flexible Policies
    Oh, Hanbit
    Sasaki, Hikaru
    Michael, Brendan
    Matsubara, Takamitsu
    2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021: 8629-8635
  • [2] Learning Robust Policies for Object Manipulation with Robot Swarms
    Gebhardt, Gregor H. W.
    Daun, Kevin
    Schnaubelt, Marius
    Neumann, Gerhard
    2018 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2018: 7688-7695
  • [3] A Bayesian approach to imitation learning for robot navigation
    Ollis, Mark
    Huang, Wesley H.
    Happold, Michael
    2007 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-9, 2007: 715-720
  • [4] Robot Manipulation Learning Using Generative Adversarial Imitation Learning
    Jabri, Mohamed Khalil
    PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021: 4893-4894
  • [5] Robot learning by Single Shot Imitation for Manipulation Tasks
    Vohra, Mohit
    Behera, Laxmidhar
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
  • [6] Disturbance Injection Under Partial Automation: Robust Imitation Learning for Long-Horizon Tasks
    Tahara, Hirotaka
    Sasaki, Hikaru
    Oh, Hanbit
    Anarossi, Edgar
    Matsubara, Takamitsu
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2023, 8 (05): 2724-2731
  • [7] Bayesian Imitation Learning for End-to-End Mobile Manipulation
    Du, Yuqing
    Ho, Daniel
    Alemi, Alexander A.
    Jang, Eric
    Khansari, Mohi
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [8] Policy Gradient Bayesian Robust Optimization for Imitation Learning
    Javed, Zaynah
    Brown, Daniel S.
    Sharma, Satvik
    Zhu, Jerry
    Balakrishna, Ashwin
    Petrik, Marek
    Dragan, Anca D.
    Goldberg, Ken
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [9] Language-Conditioned Imitation Learning for Robot Manipulation Tasks
    Stepputtis, Simon
    Campbell, Joseph
    Phielipp, Mariano
    Lee, Stefan
    Baral, Chitta
    Ben Amor, Heni
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [10] Disturbance-injected Robust Imitation Learning with Task Achievement
    Tahara, Hirotaka
    Sasaki, Hikaru
    Oh, Hanbit
    Michael, Brendan
    Matsubara, Takamitsu
    2022 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2022), 2022: 2466-2472