Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning

Cited by: 0
Authors
Foerster, Jakob [1 ]
Nardelli, Nantas [1 ]
Farquhar, Gregory [1 ]
Afouras, Triantafyllos [1 ]
Torr, Philip H. S. [1 ]
Kohli, Pushmeet [2 ]
Whiteson, Shimon [1 ]
Affiliations
[1] Univ Oxford, Oxford, England
[2] Microsoft Res, Redmond, WA USA
Funding
UK Engineering and Physical Sciences Research Council; European Research Council;
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many real-world problems, such as network packet routing and urban traffic control, are naturally modeled as multi-agent reinforcement learning (RL) problems. However, existing multi-agent RL methods typically scale poorly in the problem size. Therefore, a key challenge is to translate the success of deep learning on single-agent RL to the multi-agent setting. A major stumbling block is that independent Q-learning, the most popular multi-agent RL method, introduces nonstationarity that makes it incompatible with the experience replay memory on which deep Q-learning relies. This paper proposes two methods that address this problem: 1) using a multi-agent variant of importance sampling to naturally decay obsolete data and 2) conditioning each agent's value function on a fingerprint that disambiguates the age of the data sampled from the replay memory. Results on a challenging decentralised variant of StarCraft unit micro-management confirm that these methods enable the successful combination of experience replay with multi-agent RL.
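As a worked illustration of the first method (the notation here is an assumption made for exposition, not a verbatim reproduction of the paper's equations): each agent a can reweight a replayed transition by how likely the other agents' recorded joint action is under their current policies, relative to the policies in force when the transition was collected, so that obsolete data is naturally down-weighted in the Q-learning loss:

    \mathcal{L}(\theta) = \sum_{i=1}^{b} \frac{\pi^{-a}_{t_r}\left(u^{-a}_i \mid s_i\right)}{\pi^{-a}_{t_i}\left(u^{-a}_i \mid s_i\right)} \left( y_i - Q(s_i, u_i; \theta) \right)^2

where \pi^{-a} denotes the joint policy of all agents other than a, t_r is the time of replay (training), t_i is the time sample i was collected, b is the minibatch size, and y_i is the standard Q-learning target.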
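The second method admits a short sketch in Python. This is an illustrative assumption based on the abstract, not the authors' code: the fingerprint is taken to be the normalised training iteration together with the exploration rate epsilon, appended to each agent's observation so that the Q-network can disambiguate the age of samples drawn from the replay memory; the helper add_fingerprint and its arguments are hypothetical names.

    import numpy as np

    def add_fingerprint(observation, train_iteration, epsilon, max_iterations):
        # Hypothetical helper: append a two-element fingerprint (normalised
        # training iteration, current exploration rate) to the observation.
        # A value function conditioned on this augmented input can track how
        # the other agents' policies have drifted over the course of training.
        fingerprint = np.array([train_iteration / max_iterations, epsilon], dtype=np.float32)
        return np.concatenate([np.asarray(observation, dtype=np.float32), fingerprint])

    # Usage sketch: an 8-dimensional observation becomes a 10-dimensional input.
    obs = np.random.rand(8).astype(np.float32)
    augmented_obs = add_fingerprint(obs, train_iteration=250, epsilon=0.35, max_iterations=1000)
    print(augmented_obs.shape)  # (10,)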
Pages: 10
Related Papers
50 records in total
  • [1] Parallelized Synchronous Multi-Agent Deep Reinforcement Learning with Experience Replay Memory
    Gong, Xudong
    Ding, Bo
    Xu, Jie
    Wang, Huaimin
    Zhou, Xing
    Feng, Dawei
    2019 13TH IEEE INTERNATIONAL CONFERENCE ON SERVICE-ORIENTED SYSTEM ENGINEERING (SOSE) / 10TH INTERNATIONAL WORKSHOP ON JOINT CLOUD COMPUTING (JCC) / IEEE INTERNATIONAL WORKSHOP ON CLOUD COMPUTING IN ROBOTIC SYSTEMS (CCRS), 2019, : 325 - 330
  • [2] Robust experience replay sampling for multi-agent reinforcement learning
    Nicholaus, Isack Thomas
    Kang, Dae-Ki
    PATTERN RECOGNITION LETTERS, 2022, 155 : 135 - 142
  • [3] Experience Selection in Multi-Agent Deep Reinforcement Learning
    Wang, Yishen
    Zhang, Zongzhang
    2019 IEEE 31ST INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2019), 2019, : 864 - 870
  • [4] Multi-Microgrid Energy Management Strategy Based on Multi-Agent Deep Reinforcement Learning with Prioritized Experience Replay
    Guo, Guodong
    Gong, Yanfeng
    APPLIED SCIENCES-BASEL, 2023, 13 (05):
  • [5] Cooperative multi-agent target searching: a deep reinforcement learning approach based on parallel hindsight experience replay
    Zhou, Yi
    Liu, Zhixiang
    Shi, Huaguang
    Li, Si
    Ning, Nianwen
    Liu, Fuqiang
    Gao, Xiaozhi
    COMPLEX & INTELLIGENT SYSTEMS, 2023, 9 (05) : 4887 - 4898
  • [6] Cooperative multi-agent target searching: a deep reinforcement learning approach based on parallel hindsight experience replay
    Yi Zhou
    Zhixiang Liu
    Huaguang Shi
    Si Li
    Nianwen Ning
    Fuqiang Liu
    Xiaozhi Gao
    Complex & Intelligent Systems, 2023, 9 : 4887 - 4898
  • [7] DIFFER: Decomposing Individual Reward for Fair Experience Replay in Multi-Agent Reinforcement Learning
    Hu, Xunhan
    Zhao, Jian
    Zhou, Wengang
    Feng, Ruili
    Li, Houqiang
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [8] MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer
    Jeon, Jeewon
    Kim, Woojun
    Jung, Whiyoung
    Sung, Youngchul
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022, : 10041 - 10052
  • [9] HALFTONING WITH MULTI-AGENT DEEP REINFORCEMENT LEARNING
    Jiang, Haitian
    Xiong, Dongliang
    Jiang, Xiaowen
    Yin, Aiguo
    Ding, Li
    Huang, Kai
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 641 - 645
  • [10] Deep reinforcement learning for multi-agent interaction
    Ahmed, Ibrahim H.
    Brewitt, Cillian
    Carlucho, Ignacio
    Christianos, Filippos
    Dunion, Mhairi
    Fosong, Elliot
    Garcin, Samuel
    Guo, Shangmin
    Gyevnar, Balint
    McInroe, Trevor
    Papoudakis, Georgios
    Rahman, Arrasy
    Schafer, Lukas
    Tamborski, Massimiliano
    Vecchio, Giuseppe
    Wang, Cheng
    Albrecht, Stefano V.
    AI COMMUNICATIONS, 2022, 35 (04) : 357 - 368