Discrete-time dynamic graphical games: model-free reinforcement learning solution

Cited by: 55
Authors
Abouheaf M.I. [1 ]
Lewis F.L. [2 ,3 ]
Mahmoud M.S. [1 ]
Mikulski D.G. [4 ]
Affiliations
[1] Systems Engineering Department, King Fahd University of Petroleum & Minerals, Dhahran
[2] UTA Research Institute, University of Texas at Arlington, Fort Worth, TX
[3] State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang Liaoning
[4] Ground Vehicle Robotics (GVR), U.S. Army TARDEC, Warren, MI
[5] King Fahd University of Petroleum & Minerals, P.O. Box 1956, Dhahran
Source
Control Theory and Technology, 2015, Vol. 13, Issue 1, pp. 55-69
Funding
National Natural Science Foundation of China; U.S. National Science Foundation;
Keywords
discrete mechanics; dynamic graphical games; model-free reinforcement learning; Nash equilibrium; optimal control; policy iteration;
DOI
10.1007/s11768-015-3203-x
Abstract
This paper introduces a model-free reinforcement learning technique that is used to solve a class of dynamic games known as dynamic graphical games. The graphical game results from multi-agent dynamical systems, where pinning control is used to make all the agents synchronize to the state of a command generator or a leader agent. Novel coupled Bellman equations and Hamiltonian functions are developed for the dynamic graphical games. The Hamiltonian mechanics are used to derive the necessary conditions for optimality. The solution for the dynamic graphical game is given in terms of the solution to a set of coupled Hamilton-Jacobi-Bellman equations developed herein. The Nash equilibrium solution for the graphical game is given in terms of the solution to the underlying coupled Hamilton-Jacobi-Bellman equations. An online model-free policy iteration algorithm is developed to learn the Nash solution for the dynamic graphical game. This algorithm does not require any knowledge of the agents' dynamics. A proof of convergence for this multi-agent learning algorithm is given under a mild assumption about the interconnectivity properties of the graph. A gradient descent technique with critic network structures is used to implement the policy iteration algorithm to solve the graphical game online in real time. © 2015, South China University of Technology, Academy of Mathematics and Systems Science, Chinese Academy of Sciences and Springer-Verlag Berlin Heidelberg.
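To illustrate the flavor of the model-free policy iteration described in the abstract, the following is a simplified single-agent sketch using a Q-learning-style least-squares critic. It is not the paper's multi-agent graphical-game algorithm or its critic-network implementation: the system matrices `A`, `B`, the cost weights, the discount factor, and the sampling scheme are all illustrative assumptions. It does show the two defining ideas, however: policy evaluation from sampled transition data alone (no use of the dynamics model by the learner), followed by greedy policy improvement.

```python
import numpy as np

# Simplified single-agent, model-free policy iteration sketch (illustrative
# only; all matrices and constants below are assumed, not from the paper).
np.random.seed(0)
A = np.array([[1.0, 0.1], [0.0, 0.9]])   # plant matrix: unknown to the learner
B = np.array([[0.0], [0.1]])             # input matrix: unknown to the learner
Qw = np.eye(2)                           # state cost weight
Rw = np.array([[1.0]])                   # control cost weight
gamma = 0.99                             # discount factor

def quad_features(x, u):
    """Upper-triangular quadratic basis in z = [x; u], so Q(x, u) = z' H z."""
    z = np.concatenate([x, u])
    n = len(z)
    return np.array([z[i] * z[j] for i in range(n) for j in range(i, n)])

K = np.zeros((1, 2))                     # initial policy u = -K x
for it in range(10):
    # Policy evaluation: least-squares fit of the Bellman equation
    # Q(x,u) = cost(x,u) + gamma * Q(x', -K x') from sampled transitions.
    # Only (x, u, x') data is used -- the learner never touches A or B.
    Phi, y = [], []
    for _ in range(200):
        x = np.random.randn(2)
        u = -K @ x + 0.5 * np.random.randn(1)    # exploration noise
        xn = A @ x + B @ u                        # simulator stands in for the real plant
        cost = x @ Qw @ x + u @ Rw @ u
        Phi.append(quad_features(x, u) - gamma * quad_features(xn, -K @ xn))
        y.append(cost)
    theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
    # Rebuild the symmetric kernel H from the upper-triangular parameters
    # (off-diagonal features z_i z_j absorb both H_ij and H_ji, hence the /2).
    H = np.zeros((3, 3))
    idx = 0
    for i in range(3):
        for j in range(i, 3):
            H[i, j] = H[j, i] = theta[idx] / (1 if i == j else 2)
            idx += 1
    # Policy improvement: minimize Q over u, giving u = -Huu^{-1} Hux x.
    K = np.linalg.solve(H[2:, 2:], H[2:, :2])

print("learned gain K:", K)
```

In the paper's multi-agent setting, each agent fits an analogous critic for its own coupled value function using only local neighborhood information, with gradient-descent tuning in place of the batch least-squares step used here.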
Pages: 55-69
Page count: 14
Related papers
50 records in total
  • [1] Discrete-time dynamic graphical games: model-free reinforcement learning solution
    Abouheaf, Mohammed I.
    Lewis, Frank L.
    Mahmoud, Magdi S.
    Mikulski, Dariusz G.
    [J]. CONTROL THEORY AND TECHNOLOGY, 2015, 13 (01) : 55 - 69
  • [2] Model-Free Adaptive Learning Solutions for Discrete-Time Dynamic Graphical Games
    Abouheaf, Mohammed I.
    Lewis, Frank L.
    Mahmoud, Magdi S.
    [J]. 2014 IEEE 53RD ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2014, : 3578 - 3583
  • [3] Model-Free Value Iteration Solution for Dynamic Graphical Games
    Abouheaf, Mohammed
    Gueaieb, Wail
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND VIRTUAL ENVIRONMENTS FOR MEASUREMENT SYSTEMS AND APPLICATIONS (CIVEMSA), 2018,
  • [4] Multi-agent discrete-time graphical games and reinforcement learning solutions
    Abouheaf, Mohammed I.
    Lewis, Frank L.
    Vamvoudakis, Kyriakos G.
    Haesaert, Sofie
    Babuska, Robert
    [J]. AUTOMATICA, 2014, 50 (12) : 3038 - 3053
  • [5] Robust hierarchical games of linear discrete-time systems based on off-policy model-free reinforcement learning
    Ma, Xiao
    Yuan, Yuan
    [J]. JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS, 2024, 361 (07):
  • [6] Model-Free Solution to the Discrete-Time Coupled Riccati Equation Using Off-Policy Reinforcement Learning
    Li, Lu
    Wang, Liming
    Yang, Yongliang
    Dong, Jie
    Yin, Yixin
    Cheng, Shusen
    [J]. PROCEEDINGS OF THE 38TH CHINESE CONTROL CONFERENCE (CCC), 2019, : 6813 - 6818
  • [7] Model-free aperiodic tracking for discrete-time systems using hierarchical reinforcement learning
    Tian, Yingqiang
    Wan, Haiying
    Karimi, Hamid Reza
    Luan, Xiaoli
    Liu, Fei
    [J]. NEUROCOMPUTING, 2024, 609
  • [8] Model-free adaptive control design for nonlinear discrete-time processes with reinforcement learning techniques
    Liu, Dong
    Yang, Guang-Hong
    [J]. INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 2018, 49 (11) : 2298 - 2308
  • [9] Model-Free Reinforcement Learning for Fully Cooperative Multi-Agent Graphical Games
    Zhang, Qichao
    Zhao, Dongbin
    Lewis, Frank L.
    [J]. 2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018,
  • [10] Model-Free Learning Control of Nonlinear Discrete-Time Systems
    Sadegh, Nader
    [J]. 2011 AMERICAN CONTROL CONFERENCE, 2011, : 3553 - 3558