Generative subgoal oriented multi-agent reinforcement learning through potential field

Cited by: 0

Authors:

Li, Shengze [1]

Jiang, Hao [1]

Liu, Yuntao [1]

Zhang, Jieyuan [1]

Xu, Xinhai [1]

Liu, Donghong [1]

Affiliations:

[1] Acad Mil Sci, Beijing 100000, Peoples R China

Source:

NEURAL NETWORKS | 2024 / Volume 179

Funding:

National Natural Science Foundation of China;

Keywords:

Multi-agent reinforcement learning; Subgoal generation; Potential field;

DOI:

10.1016/j.neunet.2024.106552

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Guided by subgoals, multi-agent reinforcement learning (MARL) effectively improves the learning speed of agents in sparse-reward tasks. However, existing works break the consistency between the learning objectives of the subgoal-generation and subgoal-reaching stages, which significantly inhibits the effectiveness of subgoal learning. To address this problem, we propose a novel Potential field Subgoal-based Multi-Agent reinforcement learning (PSMA) method, which introduces the potential field (PF) to unify the learning objectives of the two stages. Specifically, we design a state-to-PF representation model that describes agents' states as potential fields, allowing easy measurement of the interaction effects of both allied and enemy agents. With the PF representation, a subgoal selector is designed to automatically generate multiple subgoals for each agent, drawn from the experience replay buffer that contains both individual and total PF values. Based on the determined subgoals, we define an intrinsic reward function that guides each agent to reach its respective subgoals while maximizing the joint action-value. Experimental results show that our method outperforms the state-of-the-art MARL method on both StarCraft II micro-management (SMAC) and Google Research Football (GRF) tasks with sparse reward settings.


Pages: 11
