Imagine, Initialize, and Explore: An Effective Exploration Method in Multi-Agent Reinforcement Learning

Cited by: 0
Authors
Liu, Zeyang [1 ]
Wan, Lipeng [1 ]
Yang, Xinrui [1 ]
Chen, Zhuoran [1 ]
Chen, Xingyu [1 ]
Lan, Xuguang [1 ]
Affiliations
[1] Xi An Jiao Tong Univ, Inst Artificial Intelligence & Robot, Natl Engn Res Ctr Visual Informat & Applicat, Natl Key Lab Human Machine Hybrid Augmented Intel, Xian, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
DOI
Not available
CLC Number
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Effective exploration is crucial to discovering optimal strategies for multi-agent reinforcement learning (MARL) in complex coordination tasks. Existing methods mainly utilize intrinsic rewards to enable committed exploration or use role-based learning to decompose joint action spaces, rather than directly conducting a collective search in the entire action-observation space. However, they often face challenges in obtaining the specific joint action sequences needed to reach successful states in long-horizon tasks. To address this limitation, we propose Imagine, Initialize, and Explore (IIE), a novel method that offers a promising solution for efficient multi-agent exploration in complex scenarios. IIE employs a transformer model to imagine how the agents reach a critical state that can influence each other's transition functions. Then, we initialize the environment at this state using a simulator before the exploration phase. We formulate the imagination as a sequence modeling problem, where the states, observations, prompts, actions, and rewards are predicted autoregressively. The prompt consists of timestep-to-go, return-to-go, influence value, and a one-shot demonstration, specifying the desired state and trajectory as well as guiding the action generation. By initializing agents at the critical states, IIE significantly increases the likelihood of discovering potentially important under-explored regions. Despite its simplicity, empirical results demonstrate that our method outperforms multi-agent exploration baselines on the StarCraft Multi-Agent Challenge (SMAC) and SMACv2 environments. In particular, IIE shows improved performance in the sparse-reward SMAC tasks and produces more effective curricula over the initialized states than other generative methods, such as CVAE-GAN and diffusion models.
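The abstract's imagine-initialize-explore pipeline can be sketched in code. The following is a minimal, purely illustrative outline based only on the abstract: the `Prompt` fields mirror the conditioning signals it lists (timestep-to-go, return-to-go, influence value, one-shot demonstration), while the random action stub, the toy integer-state dynamics, and all function names are assumptions standing in for the paper's trained transformer and StarCraft simulator.

```python
import random
from dataclasses import dataclass

@dataclass
class Prompt:
    """Conditioning signals named in the abstract (field names are assumed)."""
    timesteps_to_go: int    # steps until the desired critical state
    return_to_go: float     # return the imagined trajectory should achieve
    influence_value: float  # how strongly agents affect each other's transitions
    demonstration: list     # one-shot demonstration trajectory

def imagine_trajectory(prompt, start_state, action_head):
    """Autoregressively roll out an imagined path toward a critical state.

    `action_head` stands in for the transformer's prompt-conditioned action
    prediction; the additive transition is a toy stand-in for the real model.
    """
    state, trajectory = start_state, []
    for _ in range(prompt.timesteps_to_go):
        action = action_head(state, prompt)
        next_state = state + action  # toy dynamics
        trajectory.append((state, action, next_state))
        state = next_state
    return trajectory

def iie_episode(simulator_reset, explore_policy, prompt, start_state, horizon=3):
    # 1) Imagine: predict a trajectory ending at an under-explored critical state.
    imagined = imagine_trajectory(
        prompt, start_state, lambda s, p: random.choice([-1, 1])
    )
    critical_state = imagined[-1][2]
    # 2) Initialize: set the simulator directly to the imagined critical state.
    state = simulator_reset(critical_state)
    # 3) Explore: run the exploration policy from that state onward.
    return [explore_policy(state) for _ in range(horizon)]
```

The key design point this illustrates is that exploration does not start from the environment's default initial state: the simulator is reset to the imagined critical state, so the exploration budget is spent in regions that are otherwise hard to reach by chance.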
Pages: 17487-17495
Page count: 9