Generalization and Computation for Policy Classes of Generative Adversarial Imitation Learning

Cited by: 3
Authors
Zhou, Yirui [1 ]
Zhang, Yangchun [1 ]
Liu, Xiaowei [1 ]
Wang, Wanying [1 ]
Che, Zhengping [2 ]
Xu, Zhiyuan [2 ]
Tang, Jian [2 ]
Peng, Yaxin [1 ]
Affiliations
[1] Shanghai Univ, Sch Sci, Dept Math, Shanghai 200444, Peoples R China
[2] Midea Grp, AI Innovat Ctr, Shanghai 201702, Peoples R China
Keywords
Generative adversarial imitation learning; Generalization; Computation; Policy classes;
DOI
10.1007/978-3-031-14714-2_27
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Generative adversarial imitation learning (GAIL) learns an optimal policy from expert demonstrations in an environment with unknown reward functions. Unlike existing works, which study the generalization of reward function classes or discriminator classes, we focus on policy classes. This paper investigates the generalization and computation for policy classes of GAIL. Specifically, our contributions are: 1) We prove that generalization is guaranteed in GAIL when the complexity of the policy classes is properly controlled. 2) We provide an off-policy framework called the two-stage stochastic gradient (TSSG), which can efficiently solve GAIL based on soft policy iteration and attains a sublinear convergence rate to a stationary solution. Comprehensive numerical simulations are illustrated in MuJoCo environments.
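For orientation, the GAIL problem that the abstract refers to is commonly written as the min-max objective below, following the original formulation by Ho and Ermon (related paper [2]). This is a sketch under that assumption and is not taken from this record: the paper indexed here may use a different but related form, and the symbols (policy \pi, expert policy \pi_E, discriminator D, causal entropy H, regularization weight \lambda) are introduced only for illustration.

```latex
% Standard GAIL objective (Ho & Ermon, 2016); notation is assumed, not from this record.
% D maps state-action pairs to (0, 1); H(\pi) is the causal entropy of the policy.
\min_{\pi \in \Pi} \; \max_{D \in \mathcal{D}} \;
  \mathbb{E}_{(s,a) \sim \pi}\!\left[ \log D(s,a) \right]
  + \mathbb{E}_{(s,a) \sim \pi_E}\!\left[ \log\bigl(1 - D(s,a)\bigr) \right]
  - \lambda H(\pi)
```

In this notation, the abstract's first contribution concerns how the complexity of the policy class \Pi, rather than that of the discriminator class \mathcal{D}, affects generalization.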
Pages: 385 - 399
Page count: 15
Related papers
50 in total
  • [1] Distributional generative adversarial imitation learning with reproducing kernel generalization
    Zhou, Yirui
    Lu, Mengxiao
    Liu, Xiaowei
    Che, Zhengping
    Xu, Zhiyuan
    Tang, Jian
    Zhang, Yangchun
    Peng, Yan
    Peng, Yaxin
    NEURAL NETWORKS, 2023, 165 : 43 - 59
  • [2] Generative Adversarial Imitation Learning
    Ho, Jonathan
    Ermon, Stefano
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [3] Quantum generative adversarial imitation learning
    Xiao, Tailong
    Huang, Jingzheng
    Li, Hongjing
    Fan, Jianping
    Zeng, Guihua
    NEW JOURNAL OF PHYSICS, 2023, 25 (03):
  • [4] Deterministic generative adversarial imitation learning
    Zuo, Guoyu
    Chen, Kexin
    Lu, Jiahao
    Huang, Xiangsheng
    NEUROCOMPUTING, 2020, 388 : 60 - 69
  • [5] A Bayesian Approach to Generative Adversarial Imitation Learning
    Jeon, Wonseok
    Seo, Seokin
    Kim, Kee-Eung
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [6] HKGAIL: Policy shaping via integrating human knowledge with generative adversarial imitation learning
    Peng, Yanfei
    Tan, Guozhen
    Si, Huaiwei
    IET INTELLIGENT TRANSPORT SYSTEMS, 2023, 17 (07) : 1302 - 1311
  • [7] Robot Manipulation Learning Using Generative Adversarial Imitation Learning
    Jabri, Mohamed Khalil
    PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021 : 4893 - 4894
  • [8] A Survey of Imitation Learning Based on Generative Adversarial Nets
    Lin J.-H.
    Zhang Z.-Z.
    Jiang C.
    Hao J.-Y.
    Jisuanji Xuebao/Chinese Journal of Computers, 2020, 43 (02) : 326 - 351
  • [9] Generative Adversarial Imitation Learning from Failed Experiences
    Zhu, Jiacheng
    Lin, Jiahao
    Wang, Meng
    Chen, Yingfeng
    Fan, Changjie
    Jiang, Chong
    Zhang, Zongzhang
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 13997 - 13998
  • [10] Ranking-Based Generative Adversarial Imitation Learning
    Shi, Zhipeng
    Zhang, Xuehe
    Fang, Yu
    Li, Changle
    Liu, Gangfeng
    Zhao, Jie
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (10): : 8967 - 8974