Multi-Agent Reinforcement Learning via Adaptive Kalman Temporal Difference and Successor Representation

Cited by: 2
Authors
Salimibeni, Mohammad [1]
Mohammadi, Arash [1]
Malekzadeh, Parvin [2]
Plataniotis, Konstantinos N. [2]
Affiliations
[1] Concordia Univ, Concordia Inst Informat Syst Engn, Montreal, PQ H3G 1M8, Canada
[2] Univ Toronto, Dept Elect & Comp Engn, Toronto, ON M5S 3G8, Canada
Keywords
Kalman Temporal Difference; Multiple Model Adaptive Estimation; Multi-Agent Reinforcement Learning; Successor Representation; NEURAL-NETWORK; APPROXIMATION;
DOI
10.3390/s22041393
Chinese Library Classification
O65 [Analytical Chemistry];
Discipline classification codes
070302; 081704;
Abstract
Development of distributed Multi-Agent Reinforcement Learning (MARL) algorithms has attracted a surge of interest lately. Generally speaking, conventional Model-Based (MB) or Model-Free (MF) RL algorithms are not directly applicable to MARL problems because they rely on a fixed reward model for learning the underlying value function. While Deep Neural Network (DNN)-based solutions perform well, they remain prone to overfitting, high sensitivity to parameter selection, and sample inefficiency. In this paper, an adaptive Kalman Filter (KF)-based framework is introduced as an efficient alternative that addresses these problems by capitalizing on unique characteristics of the KF, such as uncertainty modeling and online second-order learning. More specifically, the paper proposes the Multi-Agent Adaptive Kalman Temporal Difference (MAK-TD) framework and its Successor Representation-based variant, referred to as MAK-SR. The proposed MAK-TD/SR frameworks consider the continuous nature of the action space associated with high-dimensional multi-agent environments and exploit Kalman Temporal Difference (KTD) to address parameter uncertainty. The proposed MAK-TD/SR frameworks are evaluated via several experiments implemented through the OpenAI Gym MARL benchmarks. In these experiments, different numbers of agents in cooperative, competitive, and mixed (cooperative-competitive) scenarios are used. The experimental results illustrate superior performance of the proposed MAK-TD/SR frameworks compared to their state-of-the-art counterparts.
Pages: 23
Related papers
50 in total
  • [1] Efficient Exploration for Multi-Agent Reinforcement Learning via Transferable Successor Features
    Wenzhang Liu
    Lu Dong
    Dan Niu
    Changyin Sun
    [J]. IEEE/CAA Journal of Automatica Sinica, 2022, 9 (09) : 1673 - 1686
  • [2] Efficient Exploration for Multi-Agent Reinforcement Learning via Transferable Successor Features
    Liu, Wenzhang
    Dong, Lu
    Niu, Dan
    Sun, Changyin
    [J]. IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2022, 9 (09) : 1673 - 1686
  • [3] Multi-Agent Cognition Difference Reinforcement Learning for Multi-Agent Cooperation
    Wang, Huimu
    Qiu, Tenghai
    Liu, Zhen
    Pu, Zhiqiang
    Yi, Jianqiang
    Yuan, Wanmai
    [J]. 2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [4] Adaptive and Dynamic Service Composition via Multi-agent reinforcement learning
    Wang, Hongbing
    Wu, Qin
    Chen, Xin
    Yu, Qi
    Zheng, Zibin
    Bouguettaya, Athman
    [J]. 2014 IEEE 21ST INTERNATIONAL CONFERENCE ON WEB SERVICES (ICWS 2014), 2014, : 447 - 454
  • [5] Temporal Abstraction in Reinforcement Learning with the Successor Representation
    Machado, Marlos C.
    Barreto, Andre
    Precup, Doina
    Bowling, Michael
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24
  • [6] Multi-agent reinforcement learning with adaptive mimetism
    Yamaguchi, T
    Miura, M
    Yachida, M
    [J]. ETFA '96 - 1996 IEEE CONFERENCE ON EMERGING TECHNOLOGIES AND FACTORY AUTOMATION, PROCEEDINGS, VOLS 1 AND 2, 1996, : 288 - 294
  • [7] Explainable Multi-Agent Reinforcement Learning for Temporal Queries
    Boggess, Kayla
    Kraus, Sarit
    Feng, Lu
    [J]. PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 55 - 63
  • [8] Adaptive mean field multi-agent reinforcement learning
    Wang, Xiaoqiang
    Ke, Liangjun
    Zhang, Gewei
    Zhu, Dapeng
    [J]. INFORMATION SCIENCES, 2024, 669
  • [9] ADAPTIVE STATE REPRESENTATIONS FOR MULTI-AGENT REINFORCEMENT LEARNING
    De Hauwere, Yann-Michael
    Vrancx, Peter
    Nowe, Ann
    [J]. ICAART 2011: PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 2, 2011, : 181 - 189
  • [10] Adaptive Average Exploration in Multi-Agent Reinforcement Learning
    Hall, Garrett
    Holladay, Ken
    [J]. 2020 AIAA/IEEE 39TH DIGITAL AVIONICS SYSTEMS CONFERENCE (DASC) PROCEEDINGS, 2020,