Hierarchical multi-agent reinforcement learning for cooperative tasks with sparse rewards in continuous domain

Times cited: 0

Authors:

Cao, Jingyu [1,3]

Dong, Lu [2]

Yuan, Xin [1]

Wang, Yuanda [1]

Sun, Changyin [1,3]

Affiliations:

[1] Southeast Univ, Sch Automat, Nanjing 210096, Peoples R China

[2] Southeast Univ, Sch Cyber Sci & Engn, Nanjing 211189, Peoples R China

[3] Peng Cheng Lab, Shenzhen 518055, Peoples R China

Source:

NEURAL COMPUTING & APPLICATIONS | 2024, Vol. 36, No. 1

Funding:

National Natural Science Foundation of China;

Keywords:

Reinforcement learning; Sparse reward; Cooperative multi-agent systems; Hierarchical framework; Two-stream structure;

DOI:

10.1007/s00521-023-08882-6

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The sparse reward problem has long been one of the most challenging topics in the application of reinforcement learning (RL), especially in complex multi-agent systems. In this paper, a hierarchical multi-agent RL architecture is developed to address the sparse reward problem of cooperative tasks in continuous domains. The proposed architecture is divided into two levels. The higher-level meta-agent implements state transitions on a larger time scale to alleviate the sparse reward problem; it receives the global observation as spatial information and formulates sub-goals for the lower-level agents. Each lower-level agent receives its local observation and a sub-goal and completes the cooperative task. In addition, to improve the stability of the higher-level policy, a channel is built to transmit the lower-level policy to the meta-agent as temporal information, and a two-stream structure is adopted in the actor-critic networks of the meta-agent to process the spatial and temporal information. Simulation experiments on different tasks demonstrate that the proposed algorithm effectively alleviates the sparse reward problem and learns the desired cooperative policies.
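
The two-level control flow described in the abstract can be sketched roughly as follows. This is an illustrative toy, not the authors' implementation: the function names (`meta_policy`, `low_policy`, `run_episode`), the 2-D point-agent setting, the sub-goal interval `k`, and the random and step-toward-goal placeholder policies are all assumptions made for the example. Learning, rewards, and the two-stream actor-critic networks are omitted entirely.

```python
import random


def meta_policy(global_obs, n_agents, rng):
    """Hypothetical higher-level policy: returns one 2-D sub-goal per agent.

    A learned meta-agent would condition on global_obs; this placeholder
    ignores it and samples sub-goals uniformly from [-1, 1]^2.
    """
    return [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in range(n_agents)]


def low_policy(local_obs, sub_goal, max_step=0.1):
    """Hypothetical lower-level policy: move toward the sub-goal.

    The displacement is clipped to max_step along each axis.
    """
    return tuple(
        max(-max_step, min(max_step, goal - pos))
        for pos, goal in zip(local_obs, sub_goal)
    )


def run_episode(n_agents=2, horizon=20, k=5, seed=0):
    """Run one episode; the meta-agent acts once every k environment steps."""
    rng = random.Random(seed)
    positions = [(0.0, 0.0)] * n_agents
    sub_goals = None
    meta_decisions = 0
    for t in range(horizon):
        if t % k == 0:
            # Larger time scale: the global observation here is all positions.
            sub_goals = meta_policy(positions, n_agents, rng)
            meta_decisions += 1
        # Each lower-level agent sees only its own position and its sub-goal.
        actions = [low_policy(p, g) for p, g in zip(positions, sub_goals)]
        positions = [(p[0] + a[0], p[1] + a[1]) for p, a in zip(positions, actions)]
    return meta_decisions, positions


if __name__ == "__main__":
    decisions, final_positions = run_episode()
    print(decisions, final_positions)
```

With `horizon=20` and `k=5`, the meta-agent makes 4 decisions while the lower-level agents take 20 steps each, which is the sense in which the higher level operates on a larger time scale.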

Pages: 273-287

Page count: 15
