Generalization and Computation for Policy Classes of Generative Adversarial Imitation Learning

Cited by: 3

Authors:

Zhou, Yirui [1]

Zhang, Yangchun [1]

Liu, Xiaowei [1]

Wang, Wanying [1]

Che, Zhengping [2]

Xu, Zhiyuan [2]

Tang, Jian [2]

Peng, Yaxin [1]

Affiliations:

[1] Shanghai Univ, Sch Sci, Dept Math, Shanghai 200444, Peoples R China

[2] Midea Grp, AI Innovat Ctr, Shanghai 201702, Peoples R China

Source:

PARALLEL PROBLEM SOLVING FROM NATURE - PPSN XVII, PPSN 2022, PT I | 2022, Vol. 13398

Keywords:

Generative adversarial imitation learning; Generalization; Computation; Policy classes

DOI:

10.1007/978-3-031-14714-2_27

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Generative adversarial imitation learning (GAIL) learns an optimal policy from expert demonstrations in an environment whose reward function is unknown. Unlike existing work, which studies generalization for reward function classes or discriminator classes, we focus on policy classes. This paper investigates generalization and computation for policy classes in GAIL. Specifically, our contributions are: 1) We prove that generalization is guaranteed in GAIL when the complexity of the policy class is properly controlled. 2) We provide an off-policy framework called two-stage stochastic gradient (TSSG), which efficiently solves GAIL based on soft policy iteration and attains a sublinear convergence rate to a stationary solution. Comprehensive numerical simulations are presented in MuJoCo environments.
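For readers unfamiliar with the adversarial objective mentioned in the abstract, the following is a minimal sketch of the commonly cited GAIL minimax value (without the entropy regularizer), in which a discriminator D(s, a) tries to output values near 1 on policy samples and near 0 on expert samples. This convention is an assumption taken from the general GAIL literature; the paper's own formulation, its TSSG algorithm, and its policy classes are not reproduced here. The function name `gail_objective` and the sample values are hypothetical.

```python
import math


def gail_objective(d_policy, d_expert):
    """Empirical GAIL value: mean log D on policy samples
    plus mean log(1 - D) on expert samples.

    d_policy, d_expert: lists of discriminator outputs in (0, 1).
    Assumed convention: the discriminator maximizes this value,
    the policy minimizes it.
    """
    policy_term = sum(math.log(d) for d in d_policy) / len(d_policy)
    expert_term = sum(math.log(1.0 - d) for d in d_expert) / len(d_expert)
    return policy_term + expert_term


# Hypothetical discriminator outputs, for illustration only.
# A discriminator that cannot tell policy from expert outputs 0.5 everywhere.
confused = gail_objective([0.5, 0.5], [0.5, 0.5])
# A discriminator that separates the two sources achieves a higher value.
separating = gail_objective([0.9, 0.8], [0.1, 0.2])
```

Under this convention, a policy whose samples are indistinguishable from the expert's drives the discriminator toward 0.5 everywhere, where the value equals -log 4.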


Pages: 385-399

Page count: 15
