Multi-Agent Reinforcement Learning via Adaptive Kalman Temporal Difference and Successor Representation

Cited by: 2

Authors:

Salimibeni, Mohammad ^{[1]}

Mohammadi, Arash ^{[1]}

Malekzadeh, Parvin ^{[2]}

Plataniotis, Konstantinos N. ^{[2]}

Affiliations:

[1] Concordia Univ, Concordia Inst Informat Syst Engn, Montreal, PQ H3G 1M8, Canada

[2] Univ Toronto, Dept Elect & Comp Engn, Toronto, ON M5S 3G8, Canada

Source:

SENSORS | 2022 / Vol. 22 / Issue 04

Keywords:

Kalman Temporal Difference; Multiple Model Adaptive Estimation; Multi-Agent Reinforcement Learning; Successor Representation; NEURAL-NETWORK; APPROXIMATION;

DOI:

10.3390/s22041393

Chinese Library Classification (CLC) number:

O65 [Analytical Chemistry];

Discipline classification code:

070302 ; 081704 ;

Abstract:

The development of distributed Multi-Agent Reinforcement Learning (MARL) algorithms has recently attracted a surge of interest. Generally speaking, conventional Model-Based (MB) or Model-Free (MF) RL algorithms are not directly applicable to MARL problems because they use a fixed reward model to learn the underlying value function. While Deep Neural Network (DNN)-based solutions perform well, they are still prone to overfitting, high sensitivity to parameter selection, and sample inefficiency. In this paper, an adaptive Kalman Filter (KF)-based framework is introduced as an efficient alternative that addresses these problems by capitalizing on unique characteristics of the KF, such as uncertainty modeling and online second-order learning. More specifically, the paper proposes the Multi-Agent Adaptive Kalman Temporal Difference (MAK-TD) framework and its Successor Representation-based variant, referred to as MAK-SR. The proposed MAK-TD/SR frameworks consider the continuous nature of the action space associated with high-dimensional multi-agent environments and exploit Kalman Temporal Difference (KTD) to address parameter uncertainty. The proposed MAK-TD/SR frameworks are evaluated via several experiments implemented through the OpenAI Gym MARL benchmarks. These experiments use different numbers of agents in cooperative, competitive, and mixed (cooperative-competitive) scenarios. The experimental results illustrate the superior performance of the proposed MAK-TD/SR frameworks compared to their state-of-the-art counterparts.

Pages: 23

50 entries in total

[31] Integrating Reinforcement Learning with Multi-Agent Techniques for Adaptive Service Composition
Wang, Hongbing
Chen, Xin
Wu, Qin
Yu, Qi
Hu, Xingguo
Zheng, Zibin
Bouguettaya, Athman
[J]. ACM TRANSACTIONS ON AUTONOMOUS AND ADAPTIVE SYSTEMS, 2017, 12 (02)
[32] Multi-agent reinforcement learning for adaptive demand response in smart cities
Vazquez-Canteli, Jose
Detjeen, Thomas
Henze, Gregor
Kampf, Jerome
Nagy, Zoltan
[J]. CLIMATE RESILIENT CITIES - ENERGY EFFICIENCY & RENEWABLES IN THE DIGITAL ERA (CISBAT 2019), 2019, 1343
[33] Multi-Agent Uncertainty Sharing for Cooperative Multi-Agent Reinforcement Learning
Chen, Hao
Yang, Guangkai
Zhang, Junge
Yin, Qiyue
Huang, Kaiqi
[J]. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
[34] Learning to Share in Multi-Agent Reinforcement Learning
Yi, Yuxuan
Li, Ge
Wang, Yaowei
Lu, Zongqing
[J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
[35] QDAP: Downsizing adaptive policy for cooperative multi-agent reinforcement learning
Zhao, Zhitong
Zhang, Ya
Wang, Siying
Zhang, Fan
Zhang, Malu
Chen, Wenyu
[J]. KNOWLEDGE-BASED SYSTEMS, 2024, 294
[36] Multi-Agent Reinforcement Learning With Distributed Targeted Multi-Agent Communication
Xu, Chi
Zhang, Hui
Zhang, Ya
[J]. 2023 35TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2023, : 2915 - 2920
[37] Hierarchical multi-agent reinforcement learning
Mohammad Ghavamzadeh
Sridhar Mahadevan
Rajbala Makar
[J]. Autonomous Agents and Multi-Agent Systems, 2006, 13 : 197 - 229
[38] The Dynamics of Multi-Agent Reinforcement Learning
Dickens, Luke
Broda, Krysia
Russo, Alessandra
[J]. ECAI 2010 - 19TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2010, 215 : 367 - 372
[39] Multi-agent reinforcement learning: A survey
Busoniu, Lucian
Babuska, Robert
De Schutter, Bart
[J]. 2006 9TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION, ROBOTICS AND VISION, VOLS 1- 5, 2006, : 1133 - +
[40] Partitioning in multi-agent reinforcement learning
Sun, R
Peterson, T
[J]. FROM ANIMALS TO ANIMATS 6, 2000, : 325 - 332