Unified reinforcement Q-learning for mean field game and control problems

Cited by: 24
Authors
Angiuli, Andrea [1 ]
Fouque, Jean-Pierre [1 ]
Lauriere, Mathieu [2 ]
Affiliations
[1] Univ Calif Santa Barbara, Dept Stat & Appl Probabil, South Hall 5504, Santa Barbara, CA 93106 USA
[2] Princeton Univ, Dept Operat Res & Financial Engn, Princeton, NJ 08544 USA
Keywords
Q-learning; Mean field game; Mean field control; Timescales; Linear-quadratic control
DOI
10.1007/s00498-021-00310-1
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
We present a Reinforcement Learning (RL) algorithm to solve infinite horizon asymptotic Mean Field Game (MFG) and Mean Field Control (MFC) problems. Our approach can be described as a unified two-timescale Mean Field Q-learning: the same algorithm can learn either the MFG or the MFC solution by simply tuning the ratio of two learning rates. The algorithm operates in discrete time and space, where the agent provides not only an action to the environment but also an estimate of the state distribution, in order to take into account the mean field feature of the problem. Importantly, we assume that the agent cannot observe the population's distribution and needs to estimate it in a model-free manner. The asymptotic MFG and MFC problems are also presented in continuous time and space, and compared with classical (non-asymptotic or stationary) MFG and MFC problems. They lead to explicit solutions in the linear-quadratic (LQ) case that are used as benchmarks for the results of our algorithm.
Pages: 217-271
Page count: 55
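The two-timescale scheme described in the abstract can be sketched in code. The sketch below is illustrative, not the paper's implementation: the toy environment (two states, two actions, the action-biased transition kernel, and the crowd-aversion reward) and all names such as `rho_Q` and `rho_mu` are assumptions. Only the update structure mirrors the unified algorithm: a state-distribution estimate refreshed at rate `rho_mu` and a Q-table refreshed at rate `rho_Q`, with the ratio of the two rates selecting which fixed point (MFG- or MFC-type) the iterates track.

```python
import numpy as np

def unified_mfq_learning(n_states=2, n_actions=2, rho_Q=0.1, rho_mu=0.01,
                         gamma=0.9, eps=0.1, n_steps=20000, seed=0):
    """Hedged two-timescale tabular Q-learning sketch for mean-field problems.

    Maintains a running estimate mu of the population state distribution
    (updated at rate rho_mu) alongside a Q-table (updated at rate rho_Q).
    The environment and reward here are toy stand-ins.
    """
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))
    mu = np.full(n_states, 1.0 / n_states)  # model-free distribution estimate
    x = 0
    for _ in range(n_steps):
        # epsilon-greedy action from the current Q-table
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(np.argmax(Q[x]))
        # toy dynamics: the chosen action biases the next state toward state a
        p_next = np.full(n_states, (1.0 - 0.8) / n_states)
        p_next[a] += 0.8
        x_next = int(rng.choice(n_states, p=p_next))
        # toy mean-field reward: action 1 pays off, crowding at the current state costs
        r = float(a) - mu[x]
        # timescale 1: update the distribution estimate from the observed transition
        delta = np.zeros(n_states)
        delta[x_next] = 1.0
        mu += rho_mu * (delta - mu)
        # timescale 2: standard Q-learning update at its own rate
        Q[x, a] += rho_Q * (r + gamma * np.max(Q[x_next]) - Q[x, a])
        x = x_next
    return Q, mu
```

Choosing `rho_Q` much larger than `rho_mu` effectively freezes the distribution while Q converges (an MFG-type limit), whereas `rho_mu` much larger than `rho_Q` lets the distribution adapt faster than the policy (an MFC-type limit); this is the single tuning knob the abstract refers to.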
Related papers
50 items in total
  • [41] The Improvement of Q-learning Applied to Imperfect Information Game
    Lin, Jing
    Wang, Xuan
    Han, Lijiao
    Zhang, Jiajia
    Xi, Xinxin
    2009 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC 2009), VOLS 1-9, 2009, : 1562 - +
  • [42] Double Q-learning Agent for Othello Board Game
    Somasundaram, Thamarai Selvi
    Panneerselvam, Karthikeyan
    Bhuthapuri, Tarun
    Mahadevan, Harini
    Jose, Ashik
    2018 10TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING (ICOAC), 2018, : 216 - 223
  • [43] Evolution of cooperation in the public goods game with Q-learning
    Zheng, Guozhong
    Zhang, Jiqiang
    Deng, Shengfeng
    Cai, Weiran
    Chen, Li
    CHAOS SOLITONS & FRACTALS, 2024, 188
  • [44] The Mean-Squared Error of Double Q-Learning
    Weng, Wentao
    Gupta, Harsh
    He, Niao
    Ying, Lei
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [45] Reinforcement Q-learning based flight control for a passenger aircraft under actuator fault
    Navid Mohammadi
    Moein Ebrahimi
    Morteza Tayefi
    Amirali Nikkhah
    Discover Mechanical Engineering, 4 (1):
  • [46] Reinforcement Q-learning and Optimal Tracking Control of Unknown Discrete-time Multi-player Systems Based on Game Theory
    Zhao, Jin-Gang
    INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS, 2024, 22 (05) : 1751 - 1759
  • [47] Active exploratory Q-learning for large problems
    Wu, Xianghai
    Kofman, Jonathan
    Tizhoosh, Hamid R.
    2007 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS, VOLS 1-8, 2007, : 4040 - 4045
  • [48] Human-like Autonomous Vehicle Speed Control by Deep Reinforcement Learning with Double Q-Learning
    Zhang, Yi
    Sun, Ping
    Yin, Yuhan
    Lin, Lin
    Wang, Xuesong
    2018 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV), 2018, : 1251 - 1256
  • [49] Safe Q-Learning Approaches for Human-in-Loop Reinforcement Learning
    Veerabathraswamy, Swathi
    Bhatt, Nirav
    2023 NINTH INDIAN CONTROL CONFERENCE, ICC, 2023, : 16 - 21
  • [50] Koopman Q-learning: Offline Reinforcement Learning via Symmetries of Dynamics
    Weissenbacher, Matthias
    Sinha, Samarth
    Garg, Animesh
    Kawahara, Yoshinobu
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022