Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning

Cited by: 0
Authors
Foerster, Jakob [1 ]
Nardelli, Nantas [1 ]
Farquhar, Gregory [1 ]
Afouras, Triantafyllos [1 ]
Torr, Philip H. S. [1 ]
Kohli, Pushmeet [2 ]
Whiteson, Shimon [1 ]
Affiliations
[1] Univ Oxford, Oxford, England
[2] Microsoft Res, Redmond, WA USA
Funding
UK Engineering and Physical Sciences Research Council; European Research Council;
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many real-world problems, such as network packet routing and urban traffic control, are naturally modeled as multi-agent reinforcement learning (RL) problems. However, existing multi-agent RL methods typically scale poorly in the problem size. Therefore, a key challenge is to translate the success of deep learning on single-agent RL to the multi-agent setting. A major stumbling block is that independent Q-learning, the most popular multi-agent RL method, introduces nonstationarity that makes it incompatible with the experience replay memory on which deep Q-learning relies. This paper proposes two methods that address this problem: 1) using a multi-agent variant of importance sampling to naturally decay obsolete data and 2) conditioning each agent's value function on a fingerprint that disambiguates the age of the data sampled from the replay memory. Results on a challenging decentralised variant of StarCraft unit micro-management confirm that these methods enable the successful combination of experience replay with multi-agent RL.
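As a worked illustration of the first method (the notation here is an assumption made for exposition, not a verbatim reproduction of the paper's equations): each agent a can reweight a replayed transition by how likely the other agents' recorded joint action is under their current policies, relative to the policies in force when the transition was collected, so that obsolete data is naturally down-weighted in the Q-learning loss:

    \mathcal{L}(\theta) = \sum_{i=1}^{b} \frac{\pi^{-a}_{t_r}\left(u^{-a}_i \mid s_i\right)}{\pi^{-a}_{t_i}\left(u^{-a}_i \mid s_i\right)} \left( y_i - Q(s_i, u_i; \theta) \right)^2

where \pi^{-a} denotes the joint policy of all agents other than a, t_r is the time of replay (training), t_i is the time sample i was collected, b is the minibatch size, and y_i is the standard Q-learning target.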
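The second method admits a short sketch in Python. This is an illustrative assumption based on the abstract, not the authors' code: the fingerprint is taken to be the normalised training iteration together with the exploration rate epsilon, appended to each agent's observation so that the Q-network can disambiguate the age of samples drawn from the replay memory; the helper add_fingerprint and its arguments are hypothetical names.

    import numpy as np

    def add_fingerprint(observation, train_iteration, epsilon, max_iterations):
        # Hypothetical helper: append a two-element fingerprint (normalised
        # training iteration, current exploration rate) to the observation.
        # A value function conditioned on this augmented input can track how
        # the other agents' policies have drifted over the course of training.
        fingerprint = np.array([train_iteration / max_iterations, epsilon], dtype=np.float32)
        return np.concatenate([np.asarray(observation, dtype=np.float32), fingerprint])

    # Usage sketch: an 8-dimensional observation becomes a 10-dimensional input.
    obs = np.random.rand(8).astype(np.float32)
    augmented_obs = add_fingerprint(obs, train_iteration=250, epsilon=0.35, max_iterations=1000)
    print(augmented_obs.shape)  # (10,)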
Pages: 10
Related Papers
50 records in total
  • [1] Parallelized Synchronous Multi-Agent Deep Reinforcement Learning with Experience Replay Memory
    Gong, Xudong
    Ding, Bo
    Xu, Jie
    Wang, Huaimin
    Zhou, Xing
    Feng, Dawei
    2019 13TH IEEE INTERNATIONAL CONFERENCE ON SERVICE-ORIENTED SYSTEM ENGINEERING (SOSE) / 10TH INTERNATIONAL WORKSHOP ON JOINT CLOUD COMPUTING (JCC) / IEEE INTERNATIONAL WORKSHOP ON CLOUD COMPUTING IN ROBOTIC SYSTEMS (CCRS), 2019, : 325 - 330
  • [2] Robust experience replay sampling for multi-agent reinforcement learning
    Nicholaus, Isack Thomas
    Kang, Dae-Ki
    PATTERN RECOGNITION LETTERS, 2022, 155 : 135 - 142
  • [3] Experience Selection in Multi-Agent Deep Reinforcement Learning
    Wang, Yishen
    Zhang, Zongzhang
    2019 IEEE 31ST INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2019), 2019, : 864 - 870
  • [4] Multi-Microgrid Energy Management Strategy Based on Multi-Agent Deep Reinforcement Learning with Prioritized Experience Replay
    Guo, Guodong
    Gong, Yanfeng
    APPLIED SCIENCES-BASEL, 2023, 13 (05):
  • [5] Cooperative multi-agent target searching: a deep reinforcement learning approach based on parallel hindsight experience replay
    Zhou, Yi
    Liu, Zhixiang
    Shi, Huaguang
    Li, Si
    Ning, Nianwen
    Liu, Fuqiang
    Gao, Xiaozhi
    COMPLEX & INTELLIGENT SYSTEMS, 2023, 9 (05) : 4887 - 4898
  • [6] Cooperative multi-agent target searching: a deep reinforcement learning approach based on parallel hindsight experience replay
    Yi Zhou
    Zhixiang Liu
    Huaguang Shi
    Si Li
    Nianwen Ning
    Fuqiang Liu
    Xiaozhi Gao
    Complex & Intelligent Systems, 2023, 9 : 4887 - 4898
  • [7] DIFFER: Decomposing Individual Reward for Fair Experience Replay in Multi-Agent Reinforcement Learning
    Hu, Xunhan
    Zhao, Jian
    Zhou, Wengang
    Feng, Ruili
    Li, Houqiang
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [8] MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer
    Jeon, Jeewon
    Kim, Woojun
    Jung, Whiyoung
    Sung, Youngchul
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022, : 10041 - 10052
  • [9] HALFTONING WITH MULTI-AGENT DEEP REINFORCEMENT LEARNING
    Jiang, Haitian
    Xiong, Dongliang
    Jiang, Xiaowen
    Yin, Aiguo
    Ding, Li
    Huang, Kai
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 641 - 645
  • [10] Deep reinforcement learning for multi-agent interaction
    Ahmed, Ibrahim H.
    Brewitt, Cillian
    Carlucho, Ignacio
    Christianos, Filippos
    Dunion, Mhairi
    Fosong, Elliot
    Garcin, Samuel
    Guo, Shangmin
    Gyevnar, Balint
    McInroe, Trevor
    Papoudakis, Georgios
    Rahman, Arrasy
    Schafer, Lukas
    Tamborski, Massimiliano
    Vecchio, Giuseppe
    Wang, Cheng
    Albrecht, Stefano V.
    AI COMMUNICATIONS, 2022, 35 (04) : 357 - 368