Generalization and Computation for Policy Classes of Generative Adversarial Imitation Learning

Cited by: 3
Authors
Zhou, Yirui [1]
Zhang, Yangchun [1]
Liu, Xiaowei [1]
Wang, Wanying [1]
Che, Zhengping [2]
Xu, Zhiyuan [2]
Tang, Jian [2]
Peng, Yaxin [1]
Affiliations
[1] Shanghai Univ, Sch Sci, Dept Math, Shanghai 200444, Peoples R China
[2] Midea Grp, AI Innovat Ctr, Shanghai 201702, Peoples R China
Keywords
Generative adversarial imitation learning; Generalization; Computation; Policy classes;
DOI
10.1007/978-3-031-14714-2_27
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Generative adversarial imitation learning (GAIL) learns an optimal policy from expert demonstrations in an environment with an unknown reward function. Unlike existing works that study the generalization of reward function classes or discriminator classes, we focus on policy classes. This paper investigates the generalization and computation for policy classes of GAIL. Specifically, our contributions are: 1) We prove that generalization is guaranteed in GAIL when the complexity of the policy class is properly controlled. 2) We provide an off-policy framework called the two-stage stochastic gradient (TSSG), which efficiently solves GAIL based on soft policy iteration and attains a sublinear convergence rate to a stationary solution. Comprehensive numerical simulations are carried out in MuJoCo environments.
Pages: 385-399
Number of pages: 15
Related papers
50 in total
  • [31] Exploring Gradient Explosion in Generative Adversarial Imitation Learning: A Probabilistic Perspective
    Wang, Wanying
    Zhu, Yichen
    Zhou, Yirui
    Shen, Chaomin
    Tang, Jian
    Xu, Zhiyuan
    Peng, Yaxin
    Zhang, Yangchun
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 14, 2024, : 15625 - 15633
  • [32] DIVINE: A Generative Adversarial Imitation Learning Framework for Knowledge Graph Reasoning
    Li, Ruiping
    Cheng, Xiang
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 2642 - 2651
  • [33] Generative Adversarial Imitation Learning to Search in Branch-and-Bound Algorithms
    Wang, Qi
    Blackley, Suzanne V.
    Tang, Chunlei
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2022, PT II, 2022, : 673 - 680
  • [34] AugGAIL : Augmented generative adversarial imitation learning for robotic manipulation tasks
    Jung E.
    Lee S.
    Kim I.
    Journal of Institute of Control, Robotics and Systems, 2020, 26 (05) : 325 - 334
  • [35] When Will Generative Adversarial Imitation Learning Algorithms Attain Global Convergence
    Guan, Ziwei
    Xu, Tengyu
    Liang, Yingbin
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
  • [36] Weight Adaptive Generative Adversarial Imitation Learning Based on Noise Contrastive Estimation
    Guan, Weifan
    Zhang, Xi
    Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2023, 36 (04): 300 - 312
  • [37] xGAIL: Explainable Generative Adversarial Imitation Learning for Explainable Human Decision Analysis
    Pan, Menghai
    Huang, Weixiao
    Li, Yanhua
    Zhou, Xun
    Luo, Jun
    KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2020, : 1334 - 1343
  • [38] Independent Generative Adversarial Self-Imitation Learning in Cooperative Multiagent Systems
    Hao, Xiaotian
    Wang, Weixun
    Hao, Jianye
    Yang, Yaodong
    AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 1315 - 1323
  • [39] Dynamic Economic Dispatch of Power System Based on Generative Adversarial Imitation Learning
    Chen H.
    Meng F.
    Zhang Y.
    Sun Y.
    Zhang J.
    Shan L.
    Lü X.
    Zhang P.
    Dianwang Jishu/Power System Technology, 2022, 46 (11): 4373 - 4380
  • [40] Collaborative Robot-Assisted Endovascular Catheterization with Generative Adversarial Imitation Learning
    Chi, Wenqiang
    Dagnino, Giulio
    Kwok, Trevor M. Y.
    Anh Nguyen
    Kundrat, Dennis
    Abdelaziz, Mohamed E. M. K.
    Riga, Celia
    Bicknell, Colin
    Yang, Guang-Zhong
    2020 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2020, : 2414 - 2420