Cooperative Multi-Agent Deep Reinforcement Learning with Counterfactual Reward

Times Cited: 1
Authors
Shao, Kun [1 ,2 ]
Zhu, Yuanheng [1 ]
Tang, Zhentao [1 ,2 ]
Zhao, Dongbin [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
Keywords
reinforcement learning; deep reinforcement learning; cooperative games; counterfactual reward; LEVEL; GAME; GO;
DOI
10.1109/ijcnn48605.2020.9207169
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In partially observable, fully cooperative games, agents generally aim to maximize a global reward through joint actions, which makes it difficult for each agent to deduce its own contribution. To address this credit assignment problem, we propose a multi-agent reinforcement learning algorithm with a counterfactual reward mechanism, termed the CoRe algorithm. CoRe computes the difference in global reward between the action an agent actually took and the alternative actions it could have taken, while the other agents keep their actual actions fixed. This difference quantifies each agent's contribution to the global reward. We evaluate CoRe in a simplified Pig Chase game with a decentralised Deep Q-Network (DQN) framework. The proposed method helps agents learn collaborative behaviors end to end. Compared with other DQN variants trained on the global reward, CoRe significantly improves learning efficiency and achieves better results. In addition, CoRe performs well in game environments of various sizes.
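The following is a minimal sketch of how such a counterfactual credit could be computed, assuming access to a global reward function that can be queried with alternative joint actions. The function name global_reward_fn, the averaging over alternative actions, and the toy two-agent example are illustrative assumptions, not the paper's exact formulation.

from typing import Callable, Sequence, Tuple

def counterfactual_reward(
    global_reward_fn: Callable[[Tuple[int, ...]], float],
    joint_action: Tuple[int, ...],
    agent_idx: int,
    action_space: Sequence[int],
) -> float:
    """Credit assigned to one agent via a counterfactual reward (sketch).

    Compares the global reward of the actual joint action with the rewards
    obtained when agent `agent_idx` takes each alternative action while all
    other agents keep their actual actions fixed.
    """
    actual = global_reward_fn(joint_action)
    counterfactuals = []
    for a in action_space:
        if a == joint_action[agent_idx]:
            continue  # skip the action the agent actually took
        alt = list(joint_action)
        alt[agent_idx] = a
        counterfactuals.append(global_reward_fn(tuple(alt)))
    # Averaging over alternative actions is an assumption made here; the
    # abstract only states that a global reward difference is computed.
    baseline = sum(counterfactuals) / len(counterfactuals) if counterfactuals else 0.0
    return actual - baseline

if __name__ == "__main__":
    # Toy 2-agent example: the global reward is 1 only when both agents choose action 1.
    reward = lambda ja: 1.0 if ja == (1, 1) else 0.0
    # Agent 0 chose action 1 and both cooperated: its counterfactual credit is 1.0.
    print(counterfactual_reward(reward, (1, 1), agent_idx=0, action_space=[0, 1]))

In an actual decentralised DQN setup, each agent would be trained on its own counterfactual credit instead of the shared global reward, which is the credit assignment idea the abstract describes.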
Pages: 8