Cooperative Multi-Agent Deep Reinforcement Learning with Counterfactual Reward

Times Cited: 1
Authors
Shao, Kun [1 ,2 ]
Zhu, Yuanheng [1 ]
Tang, Zhentao [1 ,2 ]
Zhao, Dongbin [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
Keywords
reinforcement learning; deep reinforcement learning; cooperative games; counterfactual reward; LEVEL; GAME; GO;
DOI
10.1109/ijcnn48605.2020.9207169
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In partially observable, fully cooperative games, agents generally aim to maximize a global reward through joint actions, which makes it difficult for each agent to deduce its own contribution. To address this credit assignment problem, we propose a multi-agent reinforcement learning algorithm with a counterfactual reward mechanism, termed the CoRe algorithm. CoRe computes the difference in global reward between the action an agent actually took and the alternative actions it could have taken, while the other agents keep their actual actions fixed. This difference quantifies each agent's contribution to the global reward. We evaluate CoRe in a simplified Pig Chase game with a decentralised Deep Q-Network (DQN) framework. The proposed method helps agents learn collaborative behaviors end to end. Compared with other DQN variants trained on the global reward, CoRe significantly improves learning efficiency and achieves better results. In addition, CoRe performs well in game environments of various sizes.
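The following is a minimal sketch of how such a counterfactual credit could be computed, assuming access to a global reward function that can be queried with alternative joint actions. The function name global_reward_fn, the averaging over alternative actions, and the toy two-agent example are illustrative assumptions, not the paper's exact formulation.

from typing import Callable, Sequence, Tuple

def counterfactual_reward(
    global_reward_fn: Callable[[Tuple[int, ...]], float],
    joint_action: Tuple[int, ...],
    agent_idx: int,
    action_space: Sequence[int],
) -> float:
    """Credit assigned to one agent via a counterfactual reward (sketch).

    Compares the global reward of the actual joint action with the rewards
    obtained when agent `agent_idx` takes each alternative action while all
    other agents keep their actual actions fixed.
    """
    actual = global_reward_fn(joint_action)
    counterfactuals = []
    for a in action_space:
        if a == joint_action[agent_idx]:
            continue  # skip the action the agent actually took
        alt = list(joint_action)
        alt[agent_idx] = a
        counterfactuals.append(global_reward_fn(tuple(alt)))
    # Averaging over alternative actions is an assumption made here; the
    # abstract only states that a global reward difference is computed.
    baseline = sum(counterfactuals) / len(counterfactuals) if counterfactuals else 0.0
    return actual - baseline

if __name__ == "__main__":
    # Toy 2-agent example: the global reward is 1 only when both agents choose action 1.
    reward = lambda ja: 1.0 if ja == (1, 1) else 0.0
    # Agent 0 chose action 1 and both cooperated: its counterfactual credit is 1.0.
    print(counterfactual_reward(reward, (1, 1), agent_idx=0, action_space=[0, 1]))

In an actual decentralised DQN setup, each agent would be trained on its own counterfactual credit instead of the shared global reward, which is the credit assignment idea the abstract describes.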
Pages: 8