Cooperative Multi-Agent Q-Learning Using Distributed MPC

Cited by: 2

Authors:

Esfahani, Hossein Nejatbakhsh [1]

Velni, Javad Mohammadpour [1]

Affiliations:

[1] Clemson Univ, Dept Mech Engn, Clemson, SC 29634 USA

Source:

IEEE CONTROL SYSTEMS LETTERS | 2024 / Vol. 8

Funding:

U.S. National Science Foundation;

Keywords:

Q-learning; Approximation algorithms; Couplings; Costs; Predictive control; Multi-agent systems; Linear programming; Multi-agent Q-Learning; distributed MPC; cooperative control;

DOI:

10.1109/LCSYS.2024.3407632

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

In this letter, we propose a cooperative Multi-Agent Reinforcement Learning (MARL) approach based on Distributed Model Predictive Control (DMPC). In the proposed framework, the local MPC schemes are formulated based on the dual decomposition method in the context of DMPC and are used to derive the local state (and action) value functions required in a cooperative Q-learning algorithm. We further show that the DMPC scheme can yield a framework based on the Value Function Decomposition (VFD) principle, so that the global state (and action) value functions can be decomposed into several local state (and action) value functions captured from the local MPCs. In the proposed cooperative MARL, the coordination between individual agents is then achieved through the multiplier-sharing step, a.k.a. inter-agent negotiation, in the DMPC scheme.
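As a rough illustration of the dual decomposition and multiplier-sharing ideas named in the abstract, the sketch below solves a toy static problem in which two agents with scalar quadratic costs must satisfy a shared coupling constraint. It is not the letter's MPC formulation or its Q-learning algorithm: the cost form, the closed-form local solve, the step size, and all names (`local_solve`, `dual_decomposition`) are assumptions made for this example only. It shows that each agent solves its own subproblem given a shared multiplier, and that at convergence the sum of the local values (minus the multiplier times the coupling target) matches the global optimal cost.

```python
def local_solve(q, r, lam):
    """Hypothetical local problem: minimize 0.5*q*(u - r)**2 + lam*u over u.

    Returns the minimizer and the local value (cost plus multiplier term).
    """
    u = r - lam / q  # closed-form minimizer of the local Lagrangian term
    value = 0.5 * q * (u - r) ** 2 + lam * u
    return u, value


def dual_decomposition(q, r, c, step=0.5, iters=200):
    """Toy dual ascent for: min sum_i 0.5*q_i*(u_i - r_i)**2  s.t.  sum_i u_i = c.

    The multiplier lam is the only quantity the agents share.
    """
    lam = 0.0
    for _ in range(iters):
        # Each agent solves its own subproblem with the current multiplier.
        results = [local_solve(qi, ri, lam) for qi, ri in zip(q, r)]
        # Multiplier-sharing step: update lam from the coupling residual.
        residual = sum(u for u, _ in results) - c
        lam += step * residual
    # Re-solve the local problems with the final multiplier.
    results = [local_solve(qi, ri, lam) for qi, ri in zip(q, r)]
    actions = [u for u, _ in results]
    # The global (dual) value is assembled from the local values.
    global_value = sum(v for _, v in results) - lam * c
    return actions, lam, global_value


if __name__ == "__main__":
    actions, lam, global_value = dual_decomposition(q=[1.0, 2.0], r=[1.0, 3.0], c=2.0)
    print(actions, lam, global_value)
```

For this convex toy problem the assembled value coincides with the centralized optimum because strong duality holds; whether and how a comparable decomposition holds for the value functions in the letter is established there, not by this sketch.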


Pages: 2193-2198

Number of pages: 6
