Hierarchical Multi-Agent Skill Discovery

Cited by: 0

Authors:

Yang, Mingyu [1]

Yang, Yaodong [2]

Lu, Zhenbo [3]

Zhou, Wengang [1,3]

Li, Houqiang [1,3]

Affiliations:

[1] Univ Sci & Technol China, Chengdu, Sichuan, Peoples R China

[2] Peking Univ, Inst AI, Beijing, Peoples R China

[3] Hefei Comprehens Natl Sci Ctr, Inst Artificial Intelligence, Hefei, Peoples R China

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023

Funding:

National Natural Science Foundation of China; National Key Research and Development Program of China;

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Skill discovery has shown significant progress in unsupervised reinforcement learning. This approach enables the discovery of a wide range of skills without any extrinsic reward, which can be effectively combined to tackle complex tasks. However, such unsupervised skill learning has not been well applied to multi-agent reinforcement learning (MARL) due to two primary challenges. One is how to learn skills not only for the individual agents but also for the entire team, and the other is how to coordinate the skills of different agents to accomplish multi-agent tasks. To address these challenges, we present Hierarchical Multi-Agent Skill Discovery (HMASD), a two-level hierarchical algorithm for discovering both team and individual skills in MARL. The high-level policy employs a transformer structure to realize sequential skill assignment, while the low-level policy learns to discover valuable team and individual skills. We evaluate HMASD on sparse reward multi-agent benchmarks, and the results show that HMASD achieves significant performance improvements compared to strong MARL baselines.


Pages: 18
