Independent Generative Adversarial Self-Imitation Learning in Cooperative Multiagent Systems

Cited by: 0
Authors
Hao, Xiaotian [1]
Wang, Weixun [1]
Hao, Jianye [1]
Yang, Yaodong [1]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Multiagent learning; Learning agent-to-agent interactions (coordination); Adversarial machine learning; Neural networks
DOI
Not available
Chinese Library Classification
TP301 [Theory and Methods]
Subject Classification Code
081202
Abstract
Many practical tasks require multiple agents to collaborate through reinforcement learning. Cooperative multiagent reinforcement learning algorithms generally fall into two paradigms: Joint Action Learners (JALs) and Independent Learners (ILs). In many applications, agents cannot observe other agents' actions and rewards, which makes JALs inapplicable. In this work, we focus on the independent learning paradigm, in which each agent makes decisions based only on its local observations. Learning is challenging in this setting because, from each agent's local viewpoint, the concurrently exploring teammates make the environment appear non-stationary. We propose a novel framework called Independent Generative Adversarial Self-Imitation Learning (IGASIL) to address coordination problems in fully cooperative multiagent environments. To the best of our knowledge, we are the first to combine self-imitation learning with generative adversarial imitation learning (GAIL) and apply it to cooperative multiagent systems. In addition, we put forward a Sub-Curriculum Experience Replay mechanism that selects beneficial past experiences to accelerate the self-imitation learning process. Evaluations on StarCraft unit micromanagement tasks and a commonly adopted benchmark show that IGASIL achieves state-of-the-art results and even outperforms JALs in both convergence speed and final performance.
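To make the abstract's idea concrete, below is a minimal, hypothetical Python sketch (not the authors' code) of the two ingredients it names: a buffer that keeps an agent's best past episodes as "self-demonstrations" (a simplified stand-in for the paper's Sub-Curriculum Experience Replay, whose exact ranking rule may differ) and a GAIL-style discriminator whose output is turned into an imitation reward. The class and function names (SelfImitationBuffer, Discriminator, imitation_reward, train_discriminator) are illustrative assumptions, not the paper's API.

```python
# Hypothetical sketch of one independent learner's self-imitation components.
import heapq
import torch
import torch.nn as nn

class SelfImitationBuffer:
    """Keeps the top-k past episodes ranked by episode return.
    Simplified stand-in for Sub-Curriculum Experience Replay."""
    def __init__(self, capacity=16):
        self.capacity = capacity
        self._heap = []      # min-heap of (return, counter, episode)
        self._counter = 0    # tie-breaker so episodes are never compared directly

    def add(self, episode, episode_return):
        # episode: list of (obs_tensor, action_tensor) pairs from one rollout
        item = (episode_return, self._counter, episode)
        self._counter += 1
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, item)
        else:
            heapq.heappushpop(self._heap, item)  # evict the worst stored episode

    def sample_pairs(self):
        # Flatten stored episodes into (obs, action) tensors used as "expert" data.
        pairs = [(o, a) for _, _, ep in self._heap for (o, a) in ep]
        obs, act = zip(*pairs)
        return torch.stack(obs), torch.stack(act)

class Discriminator(nn.Module):
    """GAIL-style discriminator D(o, a): high score means the pair resembles
    the agent's own best past behavior."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 1))

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1))

def imitation_reward(disc, obs, act):
    # Common GAIL-style shaping: r = -log(1 - D); larger when D believes
    # the pair came from the stored high-return episodes.
    with torch.no_grad():
        d = torch.sigmoid(disc(obs, act))
        return -torch.log(1.0 - d + 1e-8)

def train_discriminator(disc, opt, buffer, policy_obs, policy_act):
    """One discriminator step: buffer samples are positives,
    fresh policy samples are negatives."""
    exp_obs, exp_act = buffer.sample_pairs()
    bce = nn.BCEWithLogitsLoss()
    loss = (bce(disc(exp_obs, exp_act), torch.ones(len(exp_obs), 1)) +
            bce(disc(policy_obs, policy_act), torch.zeros(len(policy_obs), 1)))
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```

In the independent setting described by the paper, each agent would maintain its own buffer and discriminator; the sketch omits the policy update, which in practice would consume the environment reward combined with this imitation reward.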
Pages: 1315-1323
Number of pages: 9