Adversarial Imitation Learning from Incomplete Demonstrations

Cited by: 0
Authors
Sun, Mingfei [1]
Xiaojuan [1]
Affiliations
[1] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Hong Kong, Peoples R China
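Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract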
Imitation learning aims to derive a mapping from states to actions, a.k.a. a policy, from expert demonstrations. Existing imitation learning methods typically require all actions in the demonstrations to be available, which is hard to ensure in real applications. Although algorithms for learning with unobservable actions have been proposed, they focus solely on state information and overlook the fact that the action sequence may still be partially available and can provide useful information for deriving the policy. In this paper, we propose a novel algorithm called Action-Guided Adversarial Imitation Learning (AGAIL) that learns a policy from demonstrations with incomplete action sequences, i.e., incomplete demonstrations. The core idea of AGAIL is to separate demonstrations into state trajectories and action trajectories, and to train a policy on the state trajectories while using the actions as auxiliary information to guide training whenever they are available. Built upon Generative Adversarial Imitation Learning, AGAIL has three components: a generator, a discriminator, and a guide. The generator learns a policy from rewards provided by the discriminator, which tries to distinguish the state distribution of the demonstrations from that of the samples generated by the policy. The guide provides additional rewards to the generator when demonstrated actions are available for specific states. We compare AGAIL to other methods on benchmark tasks and show that it consistently delivers performance comparable to state-of-the-art methods even when the action sequence in the demonstrations is only partially available.
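The division of labor described in the abstract (state-only discriminator reward, plus a guide reward only where a demonstrated action exists) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the `-log(1 - D(s))` reward form, the squared-distance guide signal, and the `guide_weight` parameter are all assumptions chosen for illustration, and the abstract does not specify any of them.

```python
import math
from typing import List, Optional, Sequence, Tuple

Vector = Tuple[float, ...]
# One demonstration step: a state plus an action that may be missing (None).
Step = Tuple[Vector, Optional[Vector]]


def split_demonstration(demo: Sequence[Step]) -> Tuple[List[Vector], List[Optional[Vector]]]:
    """Separate a demonstration into a state trajectory and an action trajectory."""
    states = [state for state, _ in demo]
    actions = [action for _, action in demo]
    return states, actions


def discriminator_reward(d_prob: float, eps: float = 1e-8) -> float:
    """Reward from a state-only discriminator.

    Assumption: d_prob is the discriminator's estimated probability that the
    state came from the demonstrations, and the reward is -log(1 - D(s)).
    """
    return -math.log(1.0 - d_prob + eps)


def guide_reward(policy_action: Vector, demo_action: Optional[Vector]) -> float:
    """Hypothetical guide signal: negative squared distance to the demonstrated
    action, and zero when no demonstrated action is available."""
    if demo_action is None:
        return 0.0
    return -sum((p - d) ** 2 for p, d in zip(policy_action, demo_action))


def combined_reward(
    d_prob: float,
    policy_action: Vector,
    demo_action: Optional[Vector],
    guide_weight: float = 0.5,  # hypothetical trade-off coefficient
) -> float:
    """Reward passed to the generator: discriminator term plus weighted guide term."""
    return discriminator_reward(d_prob) + guide_weight * guide_reward(policy_action, demo_action)


# An incomplete demonstration: the second step has no recorded action.
demo = [((0.0, 0.0), (1.0,)), ((0.5, 0.1), None), ((1.0, 0.3), (0.0,))]
states, actions = split_demonstration(demo)
```

With this setup, a step whose action is missing contributes only the discriminator term, so the policy can still be trained from state trajectories alone, while steps with recorded actions add an extra signal that pulls the policy's action toward the demonstrated one.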
Pages: 3513 - 3519
Number of pages: 7
Related papers
50 in total
  • [1] Adversarial Imitation Learning from State-only Demonstrations
    Torabi, Faraz
    Warnell, Garrett
    Stone, Peter
    AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 2229 - 2231
  • [2] Adversarial imitation learning with mixed demonstrations from multiple demonstrators
    Zuo, Guoyu
    Zhao, Qishen
    Huang, Shuai
    Li, Jiangeng
    Gong, Daoxiong
    NEUROCOMPUTING, 2021, 457 (457) : 365 - 376
  • [3] Unlabeled Imperfect Demonstrations in Adversarial Imitation Learning
    Wang, Yunke
    Du, Bo
    Xu, Chang
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 8, 2023, : 10262 - 10270
  • [4] Model-based Adversarial Imitation Learning from Demonstrations and Human Reward
    Huang, Jie
    Hao, Jiangshan
    Juan, Rongshun
    Gomez, Randy
    Nakamura, Keisuke
    Li, Guangliang
    2023 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, IROS, 2023, : 1683 - 1690
  • [5] Robust Adversarial Imitation Learning via Adaptively-Selected Demonstrations
    Wang, Yunke
    Xu, Chang
    Du, Bo
    PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 3155 - 3161
  • [6] Efficient Off-policy Adversarial Imitation Learning with Imperfect Demonstrations
    Li, Jiangeng
    Zhao, Qishen
    Huang, Shuai
    Zuo, Guoyu
    PROCEEDINGS OF THE 33RD CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2021), 2021, : 1692 - 1697
  • [7] Multi-Modal Imitation Learning from Unstructured Demonstrations using Generative Adversarial Nets
    Hausman, Karol
    Chebotar, Yevgen
    Schaal, Stefan
    Sukhatme, Gaurav
    Lim, Joseph J.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [8] Robust Imitation Learning from Noisy Demonstrations
    Tangkaratt, Voot
    Charoenphakdee, Nontawat
    Sugiyama, Masashi
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130 : 298 - +
  • [9] Programmatic Imitation Learning From Unlabeled and Noisy Demonstrations
    Xin, Jimmy
    Zheng, Linus
    Rahmani, Kia
    Wei, Jiayi
    Holtz, Jarrett
    Dillig, Isil
    Biswas, Joydeep
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (06) : 4894 - 4901
  • [10] Model predictive optimization for imitation learning from demonstrations
    Hu, Yingbai
    Cui, Mingyang
    Duan, Jianghua
    Liu, Wenjun
    Huang, Dianye
    Knoll, Alois
    Chen, Guang
    ROBOTICS AND AUTONOMOUS SYSTEMS, 2023, 163